# Q16 – Linear RNN Dynamics

Consider the behavior of a linear RNN:
$h_t = W h_{t-1} + U x_{t} + b$

1.  Write $h_t$ as a function of $h_0$.
2.  Write out $\frac{d h_t}{d h_0}$.
3.  What happens when $t \to \infty$? Under what conditions?

## 5 thoughts on “Q16 – Linear RNN Dynamics”

1. Notice that – if I am not mistaken – the notations $U$ and $W$ (as also used in the Deep Learning textbook) are swapped here compared to the slides on RNNs (lecture 5).

2. 1) $h_t = W^t h_0 + W^{t-1}(U x_1 + b) + W^{t-2}(U x_2 + b) + \dots + W(U x_{t-1} + b) + (U x_t + b)$
2) $\frac{d h_t}{d h_0} = W^t$
3) If the elements of $W$ are 1, the gradient explodes.

Is this right or am I missing something?
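
One way to sanity-check 1) and 2) is to compare the unrolled expression against the recurrence numerically. The sketch below does that in plain Python for a 2-dimensional state; the values of `W`, `U`, `b`, `h0` and the inputs `xs` are arbitrary assumptions chosen for illustration, and the helper functions are hypothetical names, not part of any library.

```python
# Dependency-free check of the unrolled linear RNN (illustrative values only).

def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matmul(A, B):
    """Multiply matrices A and B (lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def matpow(A, k):
    """Return A**k for an integer k >= 0 by repeated multiplication."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result

def add(u, v):
    """Elementwise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]

# Hypothetical parameters and inputs, chosen only for illustration.
W = [[0.5, 0.2], [-0.1, 0.9]]
U = [[1.0, 0.0], [0.5, -1.0]]
b = [0.1, -0.2]
h0 = [1.0, 2.0]
xs = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [0.5, 0.5]]  # x_1 .. x_t
t = len(xs)

def run_rnn(h_init):
    """Iterate h_k = W h_{k-1} + U x_k + b for k = 1..t."""
    h = h_init
    for x in xs:
        h = add(add(matvec(W, h), matvec(U, x)), b)
    return h

# Unrolled form: h_t = W^t h_0 + sum_{k=1}^{t} W^{t-k} (U x_k + b)
h_closed = matvec(matpow(W, t), h0)
for k, x in enumerate(xs, start=1):
    h_closed = add(h_closed, matvec(matpow(W, t - k), add(matvec(U, x), b)))

h_iter = run_rnn(h0)

# Column j of the Jacobian is h_t(h0 + e_j) - h_t(h0). No small step size is
# needed: the map is affine in h0, so this difference is exact up to rounding.
jac = [[0.0] * 2 for _ in range(2)]
for j in range(2):
    bumped = [h0[i] + (1.0 if i == j else 0.0) for i in range(2)]
    diff = [p - q for p, q in zip(run_rnn(bumped), h_iter)]
    for i in range(2):
        jac[i][j] = diff[i]

print(h_iter, h_closed)      # the two vectors should agree up to rounding
print(jac, matpow(W, t))     # the Jacobian should match W^t up to rounding
```

Note that the input at step $k$ is multiplied by $W^{t-k}$, so the most recent input $x_t$ enters with no factor of $W$ at all.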

• gagnonlg says:

For 3), I think you can refer to sec. 10.7 of the textbook – roughly, $W$ can be eigendecomposed into
$W = Q \Lambda Q^T$, and $W^t = Q \Lambda^t Q^T$. So any feature aligned with an eigenvalue 1, the gradient will explode.

• gagnonlg says:

Part of my previous comment got messed up:

So any feature aligned with an eigenvalue less than 1 will vanish, and if any eigenvalue is greater than 1, the gradient will explode.
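
Spelled out, the argument might look as follows. This is a sketch that assumes $W$ admits a decomposition $W = Q \Lambda Q^T$ with orthogonal $Q$ (for example, when $W$ is symmetric), with $\lambda_i$ denoting the diagonal entries of $\Lambda$:

$$
W^t = (Q \Lambda Q^T)^t = Q \Lambda^t Q^T, \qquad \Lambda^t = \operatorname{diag}(\lambda_1^t, \dots, \lambda_n^t)
$$

$$
|\lambda_i| < 1 \;\Rightarrow\; \lambda_i^t \to 0, \qquad |\lambda_i| > 1 \;\Rightarrow\; |\lambda_i^t| \to \infty \qquad (t \to \infty)
$$

The middle factors cancel because $Q^T Q = I$, which is where the orthogonality assumption is used. The comparison is on the magnitude $|\lambda_i|$, since eigenvalues can be negative.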

• Tim Cooijmans says:

3 is tricky. An eigenvalue greater than one doesn’t imply explosions, e.g. if h0 is orthogonal to all the eigenvectors with eigenvalue greater than one.
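
A small sketch of this caveat, under simplifying assumptions that are not in the question itself: $W$ is diagonal (so its eigenvectors are the coordinate axes) and the input and bias are zero, so that $h_t = W^t h_0$. The eigenvalues 2 and 0.5 are arbitrary illustrative choices.

```python
# Illustrative only: W is diagonal, so its eigenvectors are the coordinate axes.
W_DIAG = [2.0, 0.5]  # eigenvalues: one > 1, one < 1 (hypothetical values)

def step(h):
    """One step of h_t = W h_{t-1}, with no input and no bias."""
    return [lam * c for lam, c in zip(W_DIAG, h)]

def norm(h):
    """Euclidean norm of a vector."""
    return sum(c * c for c in h) ** 0.5

def run(h0, steps):
    """Apply the recurrence `steps` times starting from h0."""
    h = h0
    for _ in range(steps):
        h = step(h)
    return h

steps = 20
h_orth = run([0.0, 1.0], steps)      # orthogonal to the eigenvector with eigenvalue 2
h_generic = run([1e-3, 1.0], steps)  # tiny component along that eigenvector

print(norm(h_orth))     # shrinks like 0.5**steps
print(norm(h_generic))  # dominated by 1e-3 * 2**steps
```

This only concerns the state $h_t$. The Jacobian $W^t$ does not depend on $h_0$, so in this example it still contains the entry $2^t$, and a nonzero input or bias with a component along that eigenvector would typically bring the growth back.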
