In this lecture we continue our introduction to neural networks. Specifically, we will discuss how to train neural networks, i.e. the **Backpropagation Algorithm**.

Lecture 03: Training NNs (slides modified from Hugo Larochelle’s course notes)

**Reference:** (you are responsible for all of this material)

- Chapter 6 of the Deep Learning textbook (by Ian Goodfellow, Yoshua Bengio and Aaron Courville).
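Since the lecture is about the backpropagation algorithm, here is a minimal NumPy sketch of the idea: a one-hidden-layer network trained with the chain rule and plain gradient descent. The shapes, activation, and hyperparameters below are illustrative assumptions, not the course's code.

```python
import numpy as np

# Minimal backprop sketch (assumed shapes and hyperparameters, not the
# course's code): one hidden layer, tanh activation, L2 loss, plain
# gradient descent.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))              # 8 examples, 3 features
y = rng.normal(size=(8, 1))              # regression targets
W1 = rng.normal(scale=0.1, size=(3, 4)); b1 = np.zeros(4)
W2 = rng.normal(scale=0.1, size=(4, 1)); b2 = np.zeros(1)
lr, losses = 0.1, []

for step in range(200):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    y_hat = h @ W2 + b2
    losses.append(0.5 * np.mean((y_hat - y) ** 2))

    # Backward pass: apply the chain rule layer by layer
    d_yhat = (y_hat - y) / len(X)        # dL/dy_hat
    dW2 = h.T @ d_yhat
    db2 = d_yhat.sum(axis=0)
    d_h = d_yhat @ W2.T                  # propagate through W2
    d_pre = d_h * (1 - h ** 2)           # tanh'(a) = 1 - tanh(a)^2
    dW1 = X.T @ d_pre
    db1 = d_pre.sum(axis=0)

    # Gradient descent step
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The loss should decrease over the 200 steps; Chapter 6 of the Deep Learning book derives the same computation in full generality.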


For visualizing and playing around with a neural net and the kinds of classification you can achieve: playground.tensorflow.org


What do you think of bio-inspired chips for DNNs? For example: http://www.research.ibm.com/articles/brain-chip.shtml


It is mentioned on slide 38 that mini-batch training can give a more accurate estimate of the gradient. However, doesn’t SGD with a single sample have the same expected value as any other batch size? By what measure of accuracy is this true? I would understand if it said that it lowers the variance of the gradient estimate; maybe that is what it means?
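The commenter's point can be checked numerically: the mini-batch gradient is an unbiased estimator of the full gradient for any batch size, and what changes with batch size is the variance (roughly like 1/B). A quick sketch with an assumed toy loss, not the slides' example:

```python
import numpy as np

# Sketch (assumed toy example): L(w) = mean_i 0.5*(w - x_i)^2, so the
# per-example gradient is (w - x_i). The mini-batch gradient is unbiased
# for any batch size B, but its variance shrinks roughly like 1/B.
rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=1.0, size=10_000)
w = 0.0
full_grad = np.mean(w - data)            # "true" gradient over the whole set

def minibatch_grad(batch_size, trials=5_000):
    # Draw many mini-batches and return one gradient estimate per trial.
    idx = rng.integers(0, len(data), size=(trials, batch_size))
    return (w - data[idx]).mean(axis=1)

g1 = minibatch_grad(1)
g32 = minibatch_grad(32)
print(f"full gradient: {full_grad:.3f}")
print(f"B=1  : mean {g1.mean():.3f}, variance {g1.var():.3f}")
print(f"B=32 : mean {g32.mean():.3f}, variance {g32.var():.3f}")
```

Both sample means land near the full gradient, while the B=32 estimates have far lower variance, which is the sense in which larger batches are "more accurate".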


We were discussing the loss function for a neural network. We saw the L2 loss, and the L1 loss was mentioned, as well as the cross-entropy loss.
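For concreteness, the three losses mentioned can be written out directly. The conventions below (mean reduction, one-hot targets for cross-entropy) are assumptions for illustration, not the lecture's exact notation:

```python
import numpy as np

# The three losses discussed above (assumed conventions: mean reduction;
# for cross-entropy, y is one-hot and p_hat holds class probabilities).
def l2_loss(y_hat, y):
    return np.mean((y_hat - y) ** 2)          # squared error

def l1_loss(y_hat, y):
    return np.mean(np.abs(y_hat - y))         # absolute error

def cross_entropy_loss(p_hat, y, eps=1e-12):
    return -np.mean(np.sum(y * np.log(p_hat + eps), axis=1))

y = np.array([[1.0], [2.0]]); y_hat = np.array([[1.5], [1.0]])
l2 = l2_loss(y_hat, y)   # (0.25 + 1.0) / 2 = 0.625
l1 = l1_loss(y_hat, y)   # (0.5 + 1.0) / 2 = 0.75
ce = cross_entropy_loss(np.array([[0.5, 0.5]]), np.array([[1.0, 0.0]]))  # -log(0.5)
print(l2, l1, ce)
```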

I asked: is the infinity norm used in loss functions in the field of deep learning / neural networks?

Aaron mentioned that it is used: not as a loss function, but for regularization. In particular, with dropout there is the idea of regularizing W by bounding the norm of its weight vectors (the max-norm constraint). This is explicitly mentioned, for example, in this paper: http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf
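The max-norm constraint in that paper bounds the L2 norm of each hidden unit's incoming weight vector (||w||_2 ≤ c) and is typically enforced as a projection after each gradient update. The helper below is an assumed illustration of that projection, not the paper's code:

```python
import numpy as np

# Sketch (assumed helper, not from the paper): max-norm constraint.
# Each column of W is one unit's incoming weight vector; project any
# column whose L2 norm exceeds c back onto the ball of radius c.
def max_norm_project(W, c=3.0):
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    scale = np.minimum(1.0, c / np.maximum(norms, 1e-12))
    return W * scale

W = np.array([[3.0, 0.1],
              [4.0, 0.2]])                 # column norms: 5.0 and ~0.224
W_proj = max_norm_project(W, c=3.0)
print(np.linalg.norm(W_proj, axis=0))      # first column rescaled, second untouched
```

After the projection the first column has norm exactly c = 3, while the second, already inside the ball, is left unchanged.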

Note: this comment is only to leave a trace on the blog of something I asked in class. I frequently asked questions, but was too shy to come to the blog and post them.
