Machine Learning: Variational AutoEncoders (VAE) (40)
Go to this website and check the Appendix for the Convolutional Variational Autoencoder architecture.
We have the input, then the variational autoencoder, and then the output.
But what is this Variational Autoencoder doing?
Well, the Variational Autoencoder is actually the component of our model that creates the dream.
So, where is the variational autoencoder?
Here it is
The idea behind this model is that we are allowing our artificial intelligence to dream and then come up with solutions in its dreams.
Basically, an autoencoder lets us compress this environment from a huge number of parameters down to a much smaller representation, extracting only the features that really matter for the task at hand.
Here we can see that z is not fixed; depending on its value, the reconstructed frame changes.
Try pressing “Randomize z” and you will see different reconstructed frames like these,
and so on. That allows for more diverse training.
Let’s understand what happens inside this neural network:
This is the sparse autoencoder we learned earlier. We can represent it like this
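To make the contrast with what follows concrete, here is a minimal sketch of a plain, deterministic autoencoder in PyTorch. The layer sizes and names are my own illustrative assumptions (and the sparsity penalty of a sparse autoencoder is omitted), not the architecture from the paper.

```python
import torch
import torch.nn as nn

# A plain autoencoder: the input is mapped to a single, fixed latent vector,
# and the decoder reconstructs the input from that vector. No randomness involved.
class PlainAutoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),          # deterministic latent vector
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)   # the same x always gives the same z
        return self.decoder(z)
```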
What is the difference between a normal autoencoder and a variational autoencoder?
In a variational autoencoder we want to add some stochasticity here; that is, we want to add some randomness to this latent vector.
We're going to take our inputs and map them onto a distribution. That will give us randomness, or stochasticity, in our latent vector, so we can reconstruct many different environments, or get some variation every time we decode this vector.
How to do it?
Rather than mapping straight to the latent vector, we're going to first map our inputs onto two vectors.
One of them is going to be the mean vector, so that's the mean of our distribution,
and then we're going to have the standard deviation vector.
Then, from these two vectors, we're going to sample our latent vector every single time. This is called the sampled latent vector.
After this is all trained, we'll get a slightly different latent vector every single time we sample and decode it, giving us the stochasticity we're looking for.
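As a sketch of this idea (the layer sizes and names are my own assumptions), the encoder now outputs two vectors, a mean and a log-variance, and the latent vector is sampled from the resulting distribution:

```python
import torch
import torch.nn as nn

class VAEEncoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)        # mean vector
        self.log_var = nn.Linear(256, latent_dim)   # log-variance; std = exp(0.5 * log_var)

    def forward(self, x):
        h = self.hidden(x)
        mu, log_var = self.mu(h), self.log_var(h)
        std = torch.exp(0.5 * log_var)
        # Naive sampling: draw z directly from N(mu, std). Every call gives a
        # slightly different z, which is the stochasticity we want...
        z = torch.distributions.Normal(mu, std).sample()
        # ...but .sample() is not differentiable, so gradients cannot flow back
        # into mu and std. This is exactly the problem discussed next.
        return z, mu, log_var
```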
Good to go!
But what will happen when we back propagate?
Because the sampling that produces the latent vector is stochastic, we can’t backpropagate through it.
So that's a problem. How do we resolve it?
Reparameterization trick
Here we will use the reparameterization trick. We take the sampled latent vector and represent it in the following way.
z (our sampled latent vector) is going to be the mean vector μ plus the standard deviation vector σ multiplied element-wise by ε, i.e. z = μ + σ ⊙ ε, where ε is the stochastic node, sampled from a standard normal distribution.
On the left, we can see the simplest form of the variational autoencoder, which can’t use backpropagation because z itself introduces the randomness.
On the right, we separate the sampling out into its own node: ε is just a way for us to introduce randomness into the equation.
Now, in the version on the right, z is no longer the node that generates the randomness, so we can easily backpropagate through μ and σ.
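A minimal sketch of the trick in PyTorch (the function and variable names are my own):

```python
import torch

def reparameterize(mu, log_var):
    """z = mu + sigma * eps, with eps drawn from a standard normal.

    The randomness lives entirely in eps, which needs no gradient, so
    backpropagation flows through mu and sigma (the deterministic path).
    """
    std = torch.exp(0.5 * log_var)
    eps = torch.randn_like(std)   # separate stochastic node, eps ~ N(0, I)
    return mu + std * eps
```

For reference, `torch.distributions.Normal(mu, std).rsample()` implements the same idea, whereas `.sample()` does not allow gradients to flow.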
Read more
Irhum Shafkat, 2018, Intuitively Understanding Variational Autoencoders
Diederik P. Kingma and Max Welling, 2014, Auto-Encoding Variational Bayes
Here is the code for the Variational Autoencoder.
Check this code
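Since the linked code isn’t reproduced here, below is a minimal, self-contained sketch of a variational autoencoder in PyTorch, put together from the pieces above. The architecture, layer sizes, and hyperparameters are illustrative assumptions, not the Convolutional VAE from the paper’s appendix.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.log_var = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.enc(x)
        mu, log_var = self.mu(h), self.log_var(h)
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)  # reparameterization trick
        return self.dec(z), mu, log_var

def vae_loss(recon, x, mu, log_var):
    # Reconstruction term plus KL divergence between N(mu, sigma) and N(0, I),
    # as in Kingma & Welling (2014).
    recon_loss = F.binary_cross_entropy(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon_loss + kl

# Illustrative training step on a batch of flattened inputs scaled to [0, 1]:
model = VAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)                 # stand-in batch; replace with real frames
recon, mu, log_var = model(x)
loss = vae_loss(recon, x, mu, log_var)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```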