# 18/20 – Variational Autoencoders

In this lecture we will finish up our discussion of sparse coding and start our discussion of variational autoencoders (VAEs). VAEs are the first of the generative models that we will study. We will see how they modify the standard autoencoder reconstruction loss to create a well-defined generative model with clear probabilistic semantics.
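As a rough sketch of the idea (my own illustration, not from the lecture slides), the VAE replaces the plain autoencoder reconstruction loss with the negative ELBO: a reconstruction term plus a KL term that pulls the approximate posterior N(mu, sigma^2) toward the standard normal prior. A minimal NumPy version, assuming a Gaussian decoder with fixed variance and a diagonal-Gaussian encoder:

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    # Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dims:
    # 0.5 * sum( sigma^2 + mu^2 - 1 - log sigma^2 )
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

def vae_loss(x, x_recon, mu, log_var):
    # Negative ELBO = reconstruction error + KL regularizer,
    # averaged over the batch (squared error <=> Gaussian decoder).
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    return np.mean(recon + kl_to_standard_normal(mu, log_var))
```

The KL term is what gives the model its probabilistic semantics: with mu = 0 and sigma = 1 the posterior matches the prior and the KL vanishes, which is exactly the "collapsed dimensions" behavior discussed in the comments below.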

Slides:

References (* = you are responsible for this material):

• *Sections 20.10.1-20.10.3 of the Deep Learning textbook.
• Diederik P. Kingma and Max Welling, "Auto-Encoding Variational Bayes," International Conference on Learning Representations (ICLR), 2014.
• Other references are provided in the slides.

## 5 thoughts on “18/20 – Variational Autoencoders”

1. CW says:

The other day Aaron mentioned that when using a VAE the variance of the latent variable tends not to vary much, so I ran a small experiment on MNIST. The result is presented here. Overall, with the variance (sigma) nearly fixed across data points, the standard deviation of the mean determines the size of the KL divergence. The components with a high standard deviation of the mean correspond to the non-zero-KL ones, and also to the ones with smaller variance (we are more confident about those components?).


• I am not sure I understand your question, but as far as I understood from what Aaron said, the model picks a suitable dimensionality of the latent variable to reconstruct the examples and sets the other dimensions to the prior. That is why the KL goes to zero for those dimensions.


• CW says:

I think the point of this figure is best summarised by the upper-right panel, where we can see that the variance is roughly the same across data points. It is true that the KL goes to zero for some components, which is reflected in the larger variability of the mean and the generally smaller variance of those components. If you think of this in a "posterior" sense, we are trying to infer the probability of z given an observation of x, and the issue is that the variance of the (approximate) posterior stays the same no matter what we have observed.
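The per-component behavior discussed in this thread is easy to inspect numerically, since the KL for a diagonal-Gaussian posterior decomposes across dimensions. Below is a small diagnostic sketch with hypothetical encoder outputs (not CW's actual MNIST results): two "active" dimensions whose means vary a lot and whose sigmas are small, and two "collapsed" dimensions that match the prior.

```python
import numpy as np

def per_dim_kl(mu, log_var):
    # KL( N(mu_i, sigma_i^2) || N(0, 1) ) for each latent dimension i,
    # averaged over the batch axis.
    return np.mean(0.5 * (np.exp(log_var) + mu**2 - 1.0 - log_var), axis=0)

# Hypothetical batch of 100 encoder outputs over a 4-dim latent space.
rng = np.random.default_rng(0)
mu = np.concatenate(
    [rng.normal(0, 2, (100, 2)),      # active dims: means vary a lot
     rng.normal(0, 0.01, (100, 2))],  # collapsed dims: means near 0
    axis=1)
log_var = np.concatenate(
    [np.full((100, 2), -2.0),         # active dims: small sigma (confident)
     np.zeros((100, 2))],             # collapsed dims: sigma ~ 1, as the prior
    axis=1)
kl = per_dim_kl(mu, log_var)
# Active dims have large KL; collapsed dims have KL near zero.
```

This reproduces the pattern in the comment: dimensions with high variability of the mean carry non-zero KL and have smaller posterior variance, while the rest sit at the prior with KL near zero.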
