In this article, the editor shares how to implement a variational autoencoder (VAE) in PyTorch. Since many readers may not be familiar with the topic, the article walks through a complete example for reference; I hope you learn a lot from reading it.
The complete example below is built on the MNIST data set.

# -*- coding: utf-8 -*-
"""
Created on Fri Oct 12 11:42:19 2018
@author: www
"""
import os

import torch
from torch import nn
from torch.autograd import Variable
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torchvision import transforms as tfs
from torchvision.utils import save_image

# standardization (MNIST images have a single channel)
im_tfs = tfs.Compose([
    tfs.ToTensor(),
    tfs.Normalize([0.5], [0.5])
])

train_set = MNIST('./mnist', transform=im_tfs, download=True)
train_data = DataLoader(train_set, batch_size=128, shuffle=True)

class VAE(nn.Module):
    def __init__(self):
        super(VAE, self).__init__()
        self.fc1 = nn.Linear(784, 400)
        self.fc21 = nn.Linear(400, 20)   # mean
        self.fc22 = nn.Linear(400, 20)   # log variance
        self.fc3 = nn.Linear(20, 400)
        self.fc4 = nn.Linear(400, 784)

    def encode(self, x):
        h1 = F.relu(self.fc1(x))
        return self.fc21(h1), self.fc22(h1)

    def reparametrize(self, mu, logvar):
        std = logvar.mul(0.5).exp_()
        eps = torch.FloatTensor(std.size()).normal_()
        if torch.cuda.is_available():
            eps = Variable(eps.cuda())
        else:
            eps = Variable(eps)
        return eps.mul(std).add_(mu)

    def decode(self, z):
        h3 = F.relu(self.fc3(z))
        return F.tanh(self.fc4(h3))

    def forward(self, x):
        mu, logvar = self.encode(x)           # encode
        z = self.reparametrize(mu, logvar)    # reparameterize to a normal distribution
        return self.decode(z), mu, logvar     # decode, and also return the mean and log variance

net = VAE()  # instantiate the network
if torch.cuda.is_available():
    net = net.cuda()

x, _ = train_set[0]
x = x.view(x.shape[0], -1)
if torch.cuda.is_available():
    x = x.cuda()
x = Variable(x)
_, mu, var = net(x)
print(mu)
# For a given input, the network outputs the mean and variance of the latent
# variables; at this point they have not been trained yet.

# Start training
reconstruction_function = nn.MSELoss(size_average=False)

def loss_function(recon_x, x, mu, logvar):
    """
    recon_x: generated images
    x: original images
    mu: latent mean
    logvar: latent log variance
    """
    MSE = reconstruction_function(recon_x, x)
    # KL divergence: loss = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    KLD_element = mu.pow(2).add_(logvar.exp()).mul_(-1).add_(1).add_(logvar)
    KLD = torch.sum(KLD_element).mul_(-0.5)
    return MSE + KLD

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

def to_img(x):
    '''convert the network output back to an image'''
    x = 0.5 * (x + 1.)
    x = x.clamp(0, 1)
    x = x.view(x.shape[0], 1, 28, 28)
    return x

for e in range(100):
    for im, _ in train_data:
        im = im.view(im.shape[0], -1)
        im = Variable(im)
        if torch.cuda.is_available():
            im = im.cuda()
        recon_im, mu, logvar = net(im)
        loss = loss_function(recon_im, im, mu, logvar) / im.shape[0]  # average loss over the batch
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    if (e + 1) % 20 == 0:
        print('epoch: {}, Loss: {:.4f}'.format(e + 1, loss.item()))
        save = to_img(recon_im.cpu().data)
        if not os.path.exists('./vae_img'):
            os.mkdir('./vae_img')
        save_image(save, './vae_img/image_{}.png'.format(e + 1))
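Once the loop above finishes, the trained network can be reused for a quick spot check. The snippet below is a minimal sketch, not part of the original example; it reuses net, train_set, to_img, save_image, and Variable from the code above, and the output file name is only illustrative.

# encode and decode a single training digit with the trained network
x, _ = train_set[0]
x = x.view(1, -1)                    # flatten the 1x28x28 image to 1x784
if torch.cuda.is_available():
    x = x.cuda()
recon, mu, logvar = net(Variable(x))
save_image(to_img(recon.cpu().data), './vae_img/single_recon.png')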
Addendum: a quick introduction to deep learning with PyTorch -- the variational autoencoder
The variational autoencoder is an upgraded version of the plain autoencoder. Its structure is similar: it also consists of an encoder and a decoder.
Recall the problem with ordinary autoencoders: they cannot generate images at will, because we have no way to construct the latent vector ourselves; we first have to feed in and encode an image to find out what its latent vector is. A variational autoencoder solves this problem.
The principle is actually very simple: we only need to add a constraint during encoding that forces the generated latent vectors to roughly follow a standard normal distribution. This is the biggest difference between it and an ordinary autoencoder.
Generating a new image then becomes very easy: we simply feed the decoder a random latent vector drawn from a standard normal distribution, and it produces the image we want, without having to encode an original image first.
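As a concrete illustration of this idea, here is a minimal sketch that decodes a random standard-normal latent vector into a new image. It is not part of the original article's code; it assumes the trained net (with its 20-dimensional latent space), to_img, save_image, and Variable from the example above, and the output file name is only an illustration.

z = torch.randn(1, 20)                  # latent vector sampled from N(0, 1)
if torch.cuda.is_available():
    z = z.cuda()
new_im = net.decode(Variable(z))        # decode it straight into a 784-dimensional image
save_image(to_img(new_im.cpu().data), './vae_img/generated.png')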
In practice, the latent vector produced by the encoder does not follow a standard normal distribution exactly. To measure the similarity between the two distributions, we use the KL divergence as the loss between the latent distribution and the standard normal; the other loss term is still the mean squared error between the generated image and the original image.
The KL divergence between two distributions p and q is defined as

KL(p || q) = ∫ p(x) log(p(x) / q(x)) dx

For the Gaussian N(mu, sigma^2) produced by the encoder and the standard normal N(0, 1), this integral has a closed form:

KL(N(mu, sigma^2) || N(0, 1)) = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)

which is exactly the KL term computed in the loss function below.
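As a quick check that this closed form is right, the sketch below compares it with the KL divergence computed by torch.distributions on some made-up values of mu and logvar (the numbers are only illustrative):

import torch
from torch.distributions import Normal, kl_divergence

mu = torch.tensor([0.3, -0.2])        # hypothetical latent mean
logvar = torch.tensor([0.1, -0.5])    # hypothetical latent log variance
std = torch.exp(0.5 * logvar)

closed_form = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
reference = kl_divergence(Normal(mu, std), Normal(torch.zeros(2), torch.ones(2))).sum()
print(closed_form, reference)         # the two values agree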
Reparameterization. To avoid having to compute the integral in the KL divergence, we use the reparameterization trick: instead of producing a single latent vector, the encoder produces two vectors, one representing the mean and the other the standard deviation (in the code, the log variance). Assuming the encoded latent variable follows a normal distribution, we can obtain a sample from it by drawing from a standard normal, multiplying by the standard deviation, and adding the mean; this also lets gradients flow through the sampling step. The loss then encourages the learned distribution to match a standard normal, that is, mean 0 and variance 1.
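In modern PyTorch the whole trick fits in a few lines. This is a minimal sketch using torch.randn_like rather than the article's Variable-based code:

import torch

def reparameterize(mu, logvar):
    # sample z ~ N(mu, sigma^2) by scaling and shifting a standard-normal sample
    std = torch.exp(0.5 * logvar)     # sigma = exp(0.5 * log(sigma^2))
    eps = torch.randn_like(std)       # eps ~ N(0, 1), same shape and device as std
    return mu + eps * std             # z = mu + sigma * eps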
So in the end we can define our loss as the following function: summing the mean squared error and the KL divergence gives the total loss.
def loss_function(recon_x, x, mu, logvar):
    """
    recon_x: generated images
    x: original images
    mu: latent mean
    logvar: latent log variance
    """
    MSE = reconstruction_function(recon_x, x)
    # KL divergence: loss = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    KLD_element = mu.pow(2).add_(logvar.exp()).mul_(-1).add_(1).add_(logvar)
    KLD = torch.sum(KLD_element).mul_(-0.5)
    return MSE + KLD
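A quick sanity check on made-up tensors (everything here is hypothetical and only meant to show how the function behaves; reconstruction_function is defined the same way as in the full code below):

import torch
from torch import nn

reconstruction_function = nn.MSELoss(size_average=False)

recon = torch.rand(4, 784)       # fake "reconstructions"
x = torch.rand(4, 784)           # fake "originals"
mu = torch.zeros(4, 20)          # with mu = 0 and logvar = 0 ...
logvar = torch.zeros(4, 20)
print(loss_function(recon, x, mu, logvar))   # ... the KL term is exactly 0, so this is pure MSE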
Below is a brief walkthrough of the variational autoencoder on the MNIST data set.
import os

import torch
from torch import nn
from torch.autograd import Variable
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torchvision import transforms as tfs
from torchvision.utils import save_image

# standardization (MNIST images have a single channel)
im_tfs = tfs.Compose([
    tfs.ToTensor(),
    tfs.Normalize([0.5], [0.5])
])

train_set = MNIST('./mnist', transform=im_tfs, download=True)
train_data = DataLoader(train_set, batch_size=128, shuffle=True)

class VAE(nn.Module):
    def __init__(self):
        super(VAE, self).__init__()
        self.fc1 = nn.Linear(784, 400)
        self.fc21 = nn.Linear(400, 20)   # mean
        self.fc22 = nn.Linear(400, 20)   # log variance
        self.fc3 = nn.Linear(20, 400)
        self.fc4 = nn.Linear(400, 784)

    def encode(self, x):
        h1 = F.relu(self.fc1(x))
        return self.fc21(h1), self.fc22(h1)

    def reparametrize(self, mu, logvar):
        std = logvar.mul(0.5).exp_()
        eps = torch.FloatTensor(std.size()).normal_()
        if torch.cuda.is_available():
            eps = Variable(eps.cuda())
        else:
            eps = Variable(eps)
        return eps.mul(std).add_(mu)

    def decode(self, z):
        h3 = F.relu(self.fc3(z))
        return F.tanh(self.fc4(h3))

    def forward(self, x):
        mu, logvar = self.encode(x)           # encode
        z = self.reparametrize(mu, logvar)    # reparameterize to a normal distribution
        return self.decode(z), mu, logvar     # decode, and also return the mean and log variance

net = VAE()  # instantiate the network
if torch.cuda.is_available():
    net = net.cuda()

x, _ = train_set[0]
x = x.view(x.shape[0], -1)
if torch.cuda.is_available():
    x = x.cuda()
x = Variable(x)
_, mu, var = net(x)
print(mu)

Output:

Variable containing:
Columns 0 to 9
-0.0307 -0.1439 -0.0435  0.3472  0.0368 -0.0339  0.0274 -0.5608  0.0280  0.2742
Columns 10 to 19
-0.6221 -0.0894 -0.0933  0.4241  0.1611  0.3267  0.5755 -0.0237  0.2714 -0.2806
[torch.cuda.FloatTensor of size 1x20 (GPU 0)]
As you can see, for a given input the network outputs the mean and variance of the latent variables; at this point they have not been trained yet.
reconstruction_function = nn.MSELoss(size_average=False)

def loss_function(recon_x, x, mu, logvar):
    """
    recon_x: generated images
    x: original images
    mu: latent mean
    logvar: latent log variance
    """
    MSE = reconstruction_function(recon_x, x)
    # KL divergence: loss = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    KLD_element = mu.pow(2).add_(logvar.exp()).mul_(-1).add_(1).add_(logvar)
    KLD = torch.sum(KLD_element).mul_(-0.5)
    return MSE + KLD

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

def to_img(x):
    '''convert the network output back to an image'''
    x = 0.5 * (x + 1.)
    x = x.clamp(0, 1)
    x = x.view(x.shape[0], 1, 28, 28)
    return x

for e in range(100):
    for im, _ in train_data:
        im = im.view(im.shape[0], -1)
        im = Variable(im)
        if torch.cuda.is_available():
            im = im.cuda()
        recon_im, mu, logvar = net(im)
        loss = loss_function(recon_im, im, mu, logvar) / im.shape[0]  # average loss over the batch
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    if (e + 1) % 20 == 0:
        print('epoch: {}, Loss: {:.4f}'.format(e + 1, loss.item()))  # loss.item() replaces the old loss.data[0]
        save = to_img(recon_im.cpu().data)
        if not os.path.exists('./vae_img'):
            os.mkdir('./vae_img')
        save_image(save, './vae_img/image_{}.png'.format(e + 1))

Output:

epoch: 20, Loss: 61.5803
epoch: 40, Loss: 62.9573
epoch: 60, Loss: 63.4285
epoch: 80, Loss: 64.7138
epoch: 100, Loss: 63.3343
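If you want to keep the trained weights around so you can sample from the decoder later without retraining, here is a minimal sketch (the file name './vae.pth' is just an example):

# save only the parameters of the trained network
torch.save(net.state_dict(), './vae.pth')

# later: rebuild the model and load the weights back
net2 = VAE()
net2.load_state_dict(torch.load('./vae.pth'))
net2.eval()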
Although the variational autoencoder works better than an ordinary autoencoder and constrains the probability distribution of its output code, it still computes its loss by directly taking the mean squared error between the generated image and the original image, which is not ideal. When we get to generative adversarial networks, we will discuss the limitations of computing the loss this way and introduce a new training approach: training the network through adversarial training rather than by directly comparing the per-pixel mean squared error of the two images.
That is the whole of "how to implement a variational autoencoder in PyTorch". Thank you for reading! I hope the content shared here has helped you build a solid understanding of the topic.