Variational AutoEncoder

Muhammad Atif Tahir


Introduction
• Two significant contributions to deep-learning-based generative
models in recent years:
• Variational Autoencoders
• Generative Adversarial Networks (GANs)
Variational AutoEncoders
• The variational autoencoder was proposed in 2013 by Diederik P. Kingma
and Max Welling, then at the University of Amsterdam

• Variational Autoencoders (VAEs) are models that combine Bayesian
statistics with deep neural networks. VAEs wear many hats and bridge
many different worlds

• A variational autoencoder (VAE) provides a probabilistic way of
describing an observation in latent space

• Thus, rather than building an encoder that outputs a single value to
describe each latent attribute, a VAE formulates the encoder to describe
a probability distribution for each latent attribute
D. P. Kingma and M. Welling, "Auto-Encoding Variational Bayes", arXiv preprint
arXiv:1312.6114, 2013. Cited by 32776 (as of 5/2/24)
Variational AutoEncoders
VAEs can be viewed as

• Deep neural networks,
• Bayesian statistical machines,
• Latent variable models,
• Maximum likelihood estimators,
• Dimensionality reducers, and
• Generative models
AutoEncoder vs Variational AutoEncoder
Variational AutoEncoder architecture diagram
• The encoder takes each input image and encodes it to two vectors that
together define a multivariate normal distribution in the latent space
• Some notation
• z_mean: the mean point of the distribution
• z_log_var: the logarithm of the variance of each dimension
• A point z is sampled from the distribution defined by these values
using the following equation:
• z = z_mean + z_sigma * epsilon
where z_sigma = exp(z_log_var * 0.5) and epsilon ~ N(0, I)
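The sampling equation above can be sketched in a few lines. This is a minimal illustration, not the implementation of any particular library: it uses plain Python lists to stay dependency-free, and the function name sample_z, the seed, and the example values are assumptions made for the demonstration.

```python
import math
import random


def sample_z(z_mean, z_log_var, rng):
    """Draw one latent point z = z_mean + z_sigma * epsilon, per dimension."""
    z = []
    for mean, log_var in zip(z_mean, z_log_var):
        z_sigma = math.exp(0.5 * log_var)  # standard deviation from log-variance
        epsilon = rng.gauss(0.0, 1.0)      # epsilon ~ N(0, 1), drawn per dimension
        z.append(mean + z_sigma * epsilon)
    return z


rng = random.Random(0)             # fixed seed (arbitrary) so the run is repeatable
z_mean = [1.0, -2.0]               # hypothetical encoder outputs
z_log_var = [0.0, math.log(0.25)]  # variances 1.0 and 0.25

# Draw many samples and check that they have the requested mean and variance.
samples = [sample_z(z_mean, z_log_var, rng) for _ in range(20000)]
n = len(samples)
emp_mean = [sum(s[d] for s in samples) / n for d in range(2)]
emp_var = [sum((s[d] - emp_mean[d]) ** 2 for s in samples) / n for d in range(2)]
print("empirical mean:", emp_mean)
print("empirical variance:", emp_var)
```

The empirical mean and variance should land close to z_mean and exp(z_log_var), which is what it means for the two vectors to define the distribution.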
AutoEncoders versus Variational AutoEncoders
Variational AutoEncoder architecture diagram
• The VAE modifies the standard autoencoder by making the encoder
output a distribution rather than a single fixed vector of numbers

• It then samples from this distribution during the forward pass and
uses the reparameterization trick to allow backpropagation through
the sampling step

• Both the reparameterization trick and variational inference predate
VAEs, but gained greater popularity through their application to VAEs
Reparameterization Trick
• Rather than sampling directly from a normal distribution with
parameters z_mean and z_log_var, we sample epsilon from a standard
normal distribution and then shift and scale it to have the correct
mean and variance
• This is known as the reparameterization trick, and it is important
because it lets gradients backpropagate through the sampling layer
• All of the randomness of the layer is contained within the variable
epsilon, so for a given epsilon the layer output is a deterministic,
differentiable function of its inputs z_mean and z_log_var
• This is essential for backpropagation through the layer to be possible
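To make the gradient argument concrete, the sketch below holds epsilon fixed and compares finite-difference derivatives of z with the analytic ones, dz/dz_mean = 1 and dz/dz_log_var = 0.5 * exp(0.5 * z_log_var) * epsilon. The helper names reparameterize and finite_diff and the numeric values are assumptions chosen for illustration; a real framework would compute these gradients by automatic differentiation.

```python
import math


def reparameterize(z_mean, z_log_var, epsilon):
    """z as a deterministic function of its inputs once epsilon is given."""
    return z_mean + math.exp(0.5 * z_log_var) * epsilon


def finite_diff(f, x, h=1e-6):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Arbitrary example values; epsilon is sampled once and then held fixed.
z_mean, z_log_var, epsilon = 0.3, -1.2, 0.7

# Numerical derivatives of z with respect to each input.
d_mean = finite_diff(lambda m: reparameterize(m, z_log_var, epsilon), z_mean)
d_log_var = finite_diff(lambda v: reparameterize(z_mean, v, epsilon), z_log_var)

# Analytic derivatives for comparison.
analytic_d_mean = 1.0
analytic_d_log_var = 0.5 * math.exp(0.5 * z_log_var) * epsilon

print(d_mean, analytic_d_mean)
print(d_log_var, analytic_d_log_var)
```

Because the random draw sits outside the differentiated path, both derivatives are ordinary, well-defined numbers, which is what backpropagation needs.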
VAE network with and without the reparameterization trick. 𝜙 represents the
distribution the network is trying to learn
Loss Function
• The loss of a traditional autoencoder consists only of the
reconstruction loss between images and their attempted copies after
being passed through the encoder and decoder
• For a variational autoencoder, the KL divergence between the encoder's
distribution and a standard normal distribution is added as an extra
component:
kl_loss = -0.5 * sum(1 + z_log_var - z_mean^2 - exp(z_log_var))

• The sum is taken over all the dimensions in the latent space

• kl_loss is minimized to 0 when z_mean = 0 and z_log_var = 0 for all
dimensions. As these two terms start to differ from 0, kl_loss increases
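The behaviour of the KL term can be checked numerically. The sketch below assumes the standard closed form of the KL divergence between a diagonal Gaussian and a standard normal, kl_loss = -0.5 * sum(1 + z_log_var - z_mean^2 - exp(z_log_var)), computed for a single observation; the example vectors are made up for illustration.

```python
import math


def kl_loss(z_mean, z_log_var):
    """KL divergence from N(z_mean, exp(z_log_var)) to N(0, I), summed over latent dimensions."""
    return -0.5 * sum(
        1.0 + log_var - mean * mean - math.exp(log_var)
        for mean, log_var in zip(z_mean, z_log_var)
    )


at_prior = kl_loss([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])      # encoder output equals the prior
shifted_mean = kl_loss([1.0, 0.0, 0.0], [0.0, 0.0, 0.0])  # one mean moved away from 0
wider_var = kl_loss([0.0, 0.0, 0.0], [1.0, 0.0, 0.0])     # one log-variance moved away from 0

print(at_prior, shifted_mean, wider_var)
```

The first value is zero, and the other two are positive, matching the statement that kl_loss grows as z_mean or z_log_var moves away from 0.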
• Focusing only on the reconstruction loss does allow us to separate out
the classes (in this case, MNIST digits), which should give our decoder
model the ability to reproduce the original handwritten digit, but
there is an uneven distribution of data within the latent space
• In other words, there are areas in latent space which don't represent
any of our observed data

• Conversely, if we focus only on the KL divergence term, we end up
describing every observation using the same unit Gaussian

• This effectively treats every observation as having the same
characteristics; in other words, we've failed to describe the
original data

• However, when the two terms are optimized simultaneously, the latent
state for an observation is described with distributions close to the
prior, deviating only when necessary to describe salient features
of the input
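Combining the two terms can be sketched as a single loss function. This is an illustrative sketch under stated assumptions: mean squared error stands in for the reconstruction loss (binary cross-entropy is another common choice), kl_weight is a hypothetical balancing parameter that does not appear in the slides, and the input values are made up.

```python
import math


def reconstruction_loss(x, x_hat):
    """Mean squared error between an input and its reconstruction (one possible choice)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def kl_loss(z_mean, z_log_var):
    """KL divergence from N(z_mean, exp(z_log_var)) to N(0, I), summed over latent dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(z_mean, z_log_var)
    )


def vae_loss(x, x_hat, z_mean, z_log_var, kl_weight=1.0):
    """Total loss: reconstruction term plus weighted KL term (kl_weight is hypothetical)."""
    return reconstruction_loss(x, x_hat) + kl_weight * kl_loss(z_mean, z_log_var)


x = [0.0, 1.0, 1.0, 0.0]      # toy input
x_hat = [0.1, 0.9, 0.8, 0.2]  # toy reconstruction
z_mean = [0.5, -0.5]          # toy encoder outputs
z_log_var = [0.0, 0.0]

recon = reconstruction_loss(x, x_hat)
kl = kl_loss(z_mean, z_log_var)
total = vae_loss(x, x_hat, z_mean, z_log_var)
print(recon, kl, total)
```

Setting kl_weight to 0 recovers the reconstruction-only case described above, while a very large kl_weight pushes every observation toward the same unit Gaussian.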

Summary

• The encoder-decoder architecture lies at the heart of Variational
Autoencoders (VAEs); what distinguishes them from traditional
autoencoders is that the encoding is probabilistic

• The encoder network takes raw input data and transforms it into a
probability distribution within the latent space

• The latent code generated by the encoder is a probabilistic encoding,
allowing the VAE to express not just a single point in the latent space
but a distribution of potential representations
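As a rough illustration of a probabilistic encoding, the sketch below maps one input to a pair (z_mean, z_log_var) using two linear heads. Real encoders are deep networks with learned weights; the single linear layer, the helper names linear and encode, and the hand-picked parameters here are assumptions made only to show the shape of the output.

```python
import math


def linear(weights, bias, x):
    """Compute weights @ x + bias for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def encode(x, params):
    """Return the two vectors that define the latent distribution for x."""
    z_mean = linear(params["w_mean"], params["b_mean"], x)
    z_log_var = linear(params["w_log_var"], params["b_log_var"], x)
    return z_mean, z_log_var


# Hand-picked (not learned) parameters: 3 input features, 2 latent dimensions.
params = {
    "w_mean": [[0.1, 0.0, -0.1], [0.0, 0.5, 0.0]],
    "b_mean": [0.0, -1.0],
    "w_log_var": [[0.0, 0.0, 0.0], [0.0, 0.0, -0.5]],
    "b_log_var": [0.0, 0.5],
}

x = [1.0, 2.0, 3.0]
z_mean, z_log_var = encode(x, params)
z_sigma = [math.exp(0.5 * lv) for lv in z_log_var]  # per-dimension standard deviation
print(z_mean, z_log_var, z_sigma)
```

A traditional autoencoder would stop at a single vector; here each latent dimension gets both a centre and a spread.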
Summary (Continued)
• The decoder network, in turn, takes a sampled point from the latent
distribution and reconstructs it back into data space
• During training, the model refines both the encoder and decoder
parameters to minimize the reconstruction loss – the disparity between
the input data and the decoded output
• The goal is not just to achieve accurate reconstruction but also to
regularize the latent space, ensuring that it conforms to a specified
distribution
• The process involves a delicate balance between two essential
components: the reconstruction loss and the regularization term, often
represented by the Kullback-Leibler divergence
Summary (Continued)
• The reconstruction loss compels the model to accurately reconstruct
the input, while the regularization term encourages the latent space to
adhere to the chosen distribution, preventing overfitting and promoting
generalization

• By iteratively adjusting these parameters during training, the VAE
learns to encode input data into a meaningful latent space
representation. This optimized latent code encapsulates the underlying
features and structures of the data, facilitating precise reconstruction

• The probabilistic nature of the latent space also enables the generation
of novel samples by drawing random points from the learned
distribution
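Generation can be sketched as drawing latent points and passing them through the decoder. The sketch assumes points are drawn from a standard normal, the distribution the KL term pulls the latent space toward, and it uses a toy decoder (one linear layer followed by a sigmoid) with made-up weights in place of a trained network.

```python
import math
import random


def sigmoid(a):
    """Squash a real number into the interval (0, 1), e.g. a pixel intensity."""
    return 1.0 / (1.0 + math.exp(-a))


def decode(z, weights, bias):
    """Toy decoder: one linear layer plus sigmoid, mapping a latent point to data space."""
    return [
        sigmoid(sum(w * zi for w, zi in zip(row, z)) + b)
        for row, b in zip(weights, bias)
    ]


rng = random.Random(42)  # fixed seed (arbitrary) so the run is repeatable

# Made-up decoder parameters: 2 latent dimensions, 3 output values.
weights = [[1.0, 0.0], [0.0, -1.0], [0.5, 0.5]]
bias = [0.0, 0.0, -0.2]

generated = []
for _ in range(3):
    z = [rng.gauss(0.0, 1.0) for _ in range(2)]  # draw a latent point from N(0, I)
    generated.append(decode(z, weights, bias))   # map it back to data space

for sample in generated:
    print(sample)
```

With a trained decoder in place of the toy one, each decoded point would be a new sample resembling the training data rather than a copy of any single input.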
References
• https://www.linkedin.com/pulse/understanding-variational-autoencoders-vaes-how-useful-raja
• https://www.analyticsvidhya.com/blog/2021/04/generate-your-own-dataset-using-gan/
• https://www.geeksforgeeks.org/variational-autoencoders/
• https://medium.com/retina-ai-health-inc/variational-inference-derivation-of-the-variational-autoencoder-vae-loss-function-a-true-story-3543a3dc67ee
• https://towardsdatascience.com/reparameterization-trick-126062cfd3c3
