24 – Attention and Memory

In this lecture, Dzmitry (Dima) Bahdanau will discuss attention and memory in neural networks.

Slides:

Reference: (* = you are responsible for this material)
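
As background for the lecture topic, here is a minimal NumPy sketch of the additive attention mechanism introduced in Bahdanau et al. (2015), "Neural Machine Translation by Jointly Learning to Align and Translate". The variable names and dimension sizes are illustrative assumptions, not code from the lecture.

import numpy as np

def softmax(x):
    # Numerically stable softmax.
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(s, H, W, U, v):
    # s: (d_s,) current decoder state; H: (T, d_h) encoder states.
    # Score each position with a small MLP: e_j = v^T tanh(W s + U h_j).
    scores = np.tanh(s @ W.T + H @ U.T) @ v   # (T,) alignment scores
    weights = softmax(scores)                 # soft alignment over positions
    return weights @ H                        # (d_h,) context vector

rng = np.random.default_rng(0)
d_s, d_h, d_a, T = 4, 6, 5, 7                 # illustrative sizes (assumptions)
W = rng.normal(size=(d_a, d_s))
U = rng.normal(size=(d_a, d_h))
v = rng.normal(size=d_a)
s = rng.normal(size=d_s)
H = rng.normal(size=(T, d_h))
print(additive_attention(s, H, W, U, v).shape)  # (6,)

The context vector is a convex combination of the encoder states, so the decoder can "look back" at the most relevant positions at each step; this is the soft look-up the lecture's attention and memory discussion builds on.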


One thought on “24 – Attention and Memory”

  1. If some of you are interested, there is a paper, “Frustratingly Short Attention Spans in Neural Language Modeling” (https://arxiv.org/pdf/1702.04521.pdf). Briefly, in the context of language modeling, instead of using the whole hidden state of the model for both the attention look-up and the output, they split the state into a key-value pair: the key is used only for the attention look-up, and the value only for the output (see the sketch below).
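
To make the split concrete, here is a minimal NumPy sketch of a key-value look-up over past RNN states. The dot-product score and the dimension sizes are illustrative assumptions; the paper's own scoring function and state sizes may differ.

import numpy as np

def softmax(x):
    # Numerically stable softmax.
    e = np.exp(x - x.max())
    return e / e.sum()

def key_value_attention(states, d_k):
    # states: (T, d) hidden states; each is split into a key
    # (first d_k dims, used only for the look-up) and a value
    # (remaining dims, used only for the output).
    keys, values = states[:, :d_k], states[:, d_k:]
    query = keys[-1]              # key part of the current state
    scores = keys[:-1] @ query    # dot-product look-up over past keys (an assumption)
    weights = softmax(scores)     # attention over past time steps
    return weights @ values[:-1]  # context: weighted sum of value parts

rng = np.random.default_rng(0)
H = rng.normal(size=(6, 8))       # 6 states of size 8, split as 3 key + 5 value dims
context = key_value_attention(H, d_k=3)
print(context.shape)              # (5,) value-sized context fed to the output

The design point is that one vector no longer has to serve two roles: the key dimensions can specialize in deciding where to attend, while the value dimensions specialize in what to return.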

