# IFT6266 – H2017 Deep Learning

## A Graduate Course Offered at Université de Montréal

# Q16 – Linear RNN Dynamics

Consider the behavior of a linear RNN:

1. Write h_t as a function of h_0 and the inputs.

2. Write out dh_t/dh_0.

3. What happens as t → ∞? Under what conditions?

## 5 thoughts on “Q16 – Linear RNN Dynamics”


Notice that, if I am not mistaken, the notations U and W (also used in the Deep Learning textbook) are swapped here compared to the slides on RNNs (lecture 5).


1) h_t = W^t h_0 + W^(t-1) (U x_1 + b) + W^(t-2) (U x_2 + b) + … + (U x_t + b)

2) dh_t/dh_0 = W^t

3) If the elements in W are 1, then the gradient explodes.

Is this right or am I missing something?
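A quick numerical check of 1) and 2) — a sketch, assuming the recurrence h_t = W h_{t-1} + U x_t + b and numpy; all sizes and values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed recurrence (linear RNN, no nonlinearity): h_t = W h_{t-1} + U x_t + b
n, m, T = 4, 3, 8
W = rng.normal(size=(n, n)) * 0.4
U = rng.normal(size=(n, m))
b = rng.normal(size=n)
h0 = rng.normal(size=n)
xs = rng.normal(size=(T, m))

def run(h0):
    # Step the recurrence directly for T steps
    h = h0
    for x in xs:
        h = W @ h + U @ x + b
    return h

# 1) Closed form: h_T = W^T h_0 + sum_{k=1}^T W^(T-k) (U x_k + b)
h_closed = np.linalg.matrix_power(W, T) @ h0
for k in range(1, T + 1):
    h_closed = h_closed + np.linalg.matrix_power(W, T - k) @ (U @ xs[k - 1] + b)
assert np.allclose(run(h0), h_closed)

# 2) dh_T/dh_0 = W^T: the map h_0 -> h_T is affine, so the Jacobian
# columns are exactly h_T(h_0 + e_i) - h_T(h_0).
J = np.stack([run(h0 + np.eye(n)[i]) - run(h0) for i in range(n)], axis=1)
assert np.allclose(J, np.linalg.matrix_power(W, T))
```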


For 3), I think you can refer to sec. 10.7 of the textbook – roughly, W can be eigendecomposed into

W = Q Lambda Q^T, and W^t = Q Lambda^t Q^T. So any feature aligned with an eigenvalue 1, the gradient will explode.


Part of my previous comment got messed up:

So any feature aligned with an eigenvalue less than 1 will vanish, and if any eigenvalue is greater than 1, the gradient will explode.
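That eigendecomposition argument can be checked numerically — a sketch, assuming a symmetric W so that W = Q Λ Q^T holds with orthogonal Q, and eigenvalues chosen around 1 for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Symmetric W so the decomposition W = Q Lambda Q^T holds with orthogonal Q.
eigvals = np.array([0.5, 1.0, 1.5])
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal basis
W = Q @ np.diag(eigvals) @ Q.T

t = 30
Wt = np.linalg.matrix_power(W, t)

# W^t = Q Lambda^t Q^T
assert np.allclose(Wt, Q @ np.diag(eigvals**t) @ Q.T)

# Along each eigenvector q_i, W^t q_i = lambda_i^t q_i:
# the 0.5 direction vanishes, the 1.5 direction explodes.
norms = [np.linalg.norm(Wt @ q) for q in Q.T]
assert norms[0] < 1e-8          # 0.5**30 ~ 1e-9: vanishes
assert abs(norms[1] - 1.0) < 1e-8
assert norms[2] > 1e4           # 1.5**30 ~ 2e5: explodes
```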


3 is tricky. An eigenvalue greater than one doesn’t imply explosions, e.g. if h0 is orthogonal to all the eigenvectors with eigenvalue greater than one.
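A minimal sketch of that caveat, with a diagonal W and illustrative values: the state h_t = W^t h_0 need not blow up even though ||W^t|| does, as long as h_0 has no component along the exploding eigenvector.

```python
import numpy as np

W = np.diag([2.0, 0.5])      # eigenvalue 2 > 1 along e_1
h0 = np.array([0.0, 1.0])    # h_0 orthogonal to the exploding eigenvector e_1

h = h0
for _ in range(50):
    h = W @ h

# The state decays (0.5**50 ~ 1e-15) even though ||W^50|| ~ 2**50 is huge.
assert np.linalg.norm(h) < 1e-10
assert np.linalg.norm(np.linalg.matrix_power(W, 50)) > 1e10
```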
