
Why update weights and biases after training a neural network on a whole set of training samples?

Problem Detail: 

I am reading the book Neural Networks and Deep Learning by Michael Nielsen. In the second chapter of his book, he describes the following algorithm for updating the weights and biases of a neural network:

[Figure: the book's update algorithm — for each training example, feedforward and backpropagate the error; then apply a gradient descent step to the weights and biases.]

In the 2nd step, the algorithm computes the error for a sample and backpropagates it through the layers. This happens for all of the training samples (please correct me if I am wrong, because this is what I assume is going on). Then, in the 3rd step, the weights and biases are updated. My question is: why does the algorithm wait until the third step to update the weights and biases? That is, why can't it update them after each training sample instead of after the whole set of training samples?
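The batch behaviour described above can be sketched in code. This is a minimal NumPy illustration using a single linear neuron with squared-error loss rather than the book's full multi-layer network; the function name `batch_step` and the toy data are my own, not Nielsen's:

```python
import numpy as np

def batch_step(w, b, X, y, eta):
    """One full-batch gradient descent step for a single linear neuron
    with mean squared error loss. Gradients are accumulated over ALL
    training samples first (steps 1-2 in the question); the weights
    and bias are updated only once at the end (step 3)."""
    n = len(X)
    grad_w = np.zeros_like(w)
    grad_b = 0.0
    for x_i, y_i in zip(X, y):       # compute each sample's gradient...
        err = (w @ x_i + b) - y_i    # prediction error for this sample
        grad_w += err * x_i          # ...and accumulate it
        grad_b += err
    w = w - eta * grad_w / n         # single update after the whole pass
    b = b - eta * grad_b / n
    return w, b

# Toy dataset: fit y = 2x.
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
w, b = np.zeros(1), 0.0
for _ in range(500):
    w, b = batch_step(w, b, X, y, eta=0.1)
```

After training, `w` approaches 2 and `b` approaches 0; the point is that each pass over the data produces exactly one parameter update.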

Thank you so much for your help and I appreciate it.

Asked By : user5139637
Answered By : D.W.

You are right. While you could backpropagate for all samples and then update the weights, you don't have to. Alternatively, you can iterate through the samples and, for each sample, backpropagate for just that sample and then update the weights.

The latter turns out to be more effective empirically. It is called stochastic gradient descent. Today, most neural networks are trained using stochastic gradient descent (usually on small mini-batches rather than single samples), because each pass over the data then makes many parameter updates instead of one, and in practice it reaches a good solution with fewer passes. This makes neural networks faster to train.
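The per-sample variant can be sketched in the same toy setting as follows; again this is an illustrative single-neuron sketch, not Nielsen's code, and `sgd_epoch` is a hypothetical helper name:

```python
import numpy as np

def sgd_epoch(w, b, X, y, eta):
    """One epoch of stochastic gradient descent: the weights and bias
    are updated immediately after backpropagating EACH sample, rather
    than once per pass over the whole training set."""
    idx = np.random.permutation(len(X))   # visit samples in random order
    for i in idx:
        err = (w @ X[i] + b) - y[i]
        w = w - eta * err * X[i]          # update right away
        b = b - eta * err
    return w, b

# Same toy dataset: fit y = 2x.
np.random.seed(0)                         # reproducible sample order
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
w, b = np.zeros(1), 0.0
for _ in range(300):
    w, b = sgd_epoch(w, b, X, y, eta=0.05)
```

The random shuffling each epoch is the "stochastic" part: the order in which samples drive the updates varies, which in practice helps the iterates escape poor regions and converge with fewer passes over the data.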

Anyway, you could do it either way. Both ways are valid. The stochastic gradient descent approach seems to be faster in practice, but neither one is wrong.

Why did the book describe it this way? That's a pedagogical choice. The author might have considered it more natural or easier to explain this way.

Best Answer from StackOverflow




