Denoising Variational Autoencoder

dojoteef/dvae
Overview

The purpose of this project is to compare a different method of applying a denoising criterion to a variational autoencoder model. A slightly different approach has previously been implemented as an explicit corruption of the input, as would be done for a traditional denoising autoencoder (DAE), but applied to a variational autoencoder (VAE) (Im et al., 2016 [1]). In this project the output of the generative network of the VAE is treated as a distorted input for the DAE, with the loss propagated back to the VAE; this combined model is referred to as the denoising variational autoencoder (DVAE).

Methods

This project was created in TensorFlow (version 0.12.1), partly as a way to familiarize myself with it. TensorFlow should be the only requirement for running the underlying code.

Datasets

There are two test datasets used for this project: the MNIST dataset and the CIFAR-10 dataset.

Models

For both the VAE and the DAE, the recognition model is a convolutional neural network (CNN) with a batch normalization layer before each activation function. The generative model similarly uses batch normalization, but is a deconvolutional network (DN). The VAE uses a latent space of size 50 for the MNIST dataset and 100 for the CIFAR-10 dataset.
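The latent layer of a VAE like this one is typically implemented with the reparameterization trick. The following is a minimal NumPy sketch for illustration only, not the repository's actual TensorFlow code; the function and variable names are hypothetical:

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).

    mu, log_var: outputs of the recognition (encoder) network,
    shape (batch, latent_dim) -- e.g. latent_dim=50 for MNIST.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu = np.zeros((4, 50))       # batch of 4, 50-dimensional latent space
log_var = np.zeros((4, 50))  # log-variance of 0 means unit variance
z = sample_latent(mu, log_var, rng)
print(z.shape)  # (4, 50)
```

Sampling through this deterministic transform of independent noise is what allows the gradient to flow back into the encoder.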

Training procedure

The input from the dataset is fed into a VAE. The variational lower bound is optimized by minimizing the KL-divergence (KLD) together with the expected reconstruction error. The output of the generative network of the VAE is then fed to a DAE, which treats the generated image as a distorted input and tries to minimize the reconstruction error with respect to the original input.

The joint optimization of these two networks results in the loss propagating from the DAE to the VAE, helping to drive the variational lower bound in the right direction.
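The joint objective described above can be sketched as follows. This is a NumPy illustration under the usual Gaussian-posterior and Bernoulli-pixel assumptions, not the project's TensorFlow implementation; all function names here are hypothetical:

```python
import numpy as np

def kl_divergence(mu, log_var):
    """KLD between N(mu, sigma^2) and the standard normal prior,
    summed over latent dimensions, averaged over the batch."""
    return np.mean(-0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1))

def recon_error(x, x_hat, eps=1e-7):
    """Bernoulli cross-entropy reconstruction error for pixels in [0, 1]."""
    x_hat = np.clip(x_hat, eps, 1 - eps)
    return np.mean(-np.sum(x * np.log(x_hat) + (1 - x) * np.log(1 - x_hat), axis=1))

def dvae_loss(x, mu, log_var, vae_out, dae_out):
    """VAE term (KLD + reconstruction) plus the DAE term, which treats
    the VAE's output as a corrupted input and reconstructs x from it."""
    vae_term = kl_divergence(mu, log_var) + recon_error(x, vae_out)
    dae_term = recon_error(x, dae_out)
    return vae_term + dae_term

rng = np.random.default_rng(0)
x = rng.random((8, 784))                 # a batch of flattened images
mu = rng.standard_normal((8, 50))
log_var = rng.standard_normal((8, 50))
vae_out = rng.random((8, 784))           # stand-in for the VAE's reconstruction
dae_out = rng.random((8, 784))           # stand-in for the DAE's reconstruction
loss = dvae_loss(x, mu, log_var, vae_out, dae_out)
```

Because the DAE term is a function of the VAE's output, minimizing the sum pushes gradients from the DAE back through the VAE's generative network, which is the coupling the text describes.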

Results

In order to determine whether adding a denoising criterion improves performance, the normal VAE was compared to the DVAE on both datasets.

For the MNIST dataset there are 60,000 training images, of which 6,000 were used as a hold-out validation set; there are 10,000 test images.

For the CIFAR-10 dataset there are 50,000 training images, of which 5,000 were used as a hold-out validation set; there are 10,000 test images.

Model Loss and KL-divergence

| Model | Dataset  | Samples | Testing Loss | Testing KLD |
|-------|----------|---------|--------------|-------------|
| VAE   | MNIST    | 1       | 74.248       | 20.900      |
| DVAE  | MNIST    | 1       | 71.648       | 25.728      |
| VAE   | MNIST    | 5       | 68.786       | 14.333      |
| DVAE  | MNIST    | 5       | 65.132       | 19.581      |
| VAE   | CIFAR-10 | 1       | 1811.562     | 35.668      |
| DVAE  | CIFAR-10 | 1       | 1790.816     | 52.144      |
| VAE   | CIFAR-10 | 5       | 1787.758     | 24.735      |
| DVAE  | CIFAR-10 | 5       | 1783.657     | 35.619      |

As an additional way to measure the performance of the resulting models, a separate CNN classification model was trained using a softmax activation and the cross-entropy loss. The images generated by the VAE and DVAE were fed to the classification model to determine the classification error rate of the two autoencoder networks.
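The error-rate computation behind the tables below is straightforward. A minimal NumPy sketch (the classifier itself is not shown; the logits here are made-up values for illustration):

```python
import numpy as np

def error_rate(logits, labels):
    """Fraction of examples where the argmax class disagrees with the label."""
    predictions = np.argmax(logits, axis=1)
    return np.mean(predictions != labels)

# Hypothetical classifier outputs for three examples over three classes.
logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 1.5, 0.1],
                   [0.1, 0.4, 0.2]])
labels = np.array([0, 1, 2])  # the third example is misclassified (argmax = 1)
print(error_rate(logits, labels))  # 0.333... (1 of 3 misclassified)
```

Feeding the VAE or DVAE reconstructions through a fixed classifier in this way gives a rough, task-level measure of how much label-relevant content the reconstructions preserve.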

CNN Classification Error

| Dataset  | Testing Error Rate |
|----------|--------------------|
| MNIST    | 1.01%              |
| CIFAR-10 | 22.23%             |

Generated Image Classification Error

| Model | Dataset  | Samples | Testing Error Rate |
|-------|----------|---------|--------------------|
| VAE   | MNIST    | 1       | 3.85%              |
| DVAE  | MNIST    | 1       | 2.99%              |
| VAE   | MNIST    | 5       | 2.16%              |
| DVAE  | MNIST    | 5       | 1.66%              |
| VAE   | CIFAR-10 | 1       | 71.84%             |
| DVAE  | CIFAR-10 | 1       | 70.61%             |
| VAE   | CIFAR-10 | 5       | 71.76%             |
| DVAE  | CIFAR-10 | 5       | 70.61%             |

MNIST

Here are the graphs of the loss and KL divergence on the validation set over time during training of the VAE and DVAE models on the MNIST dataset (samples=1):

MNIST Loss

*(figure: MNIST loss)*

MNIST KL Divergence

*(figure: MNIST KLD)*

These are example inputs and outputs from the VAE and DVAE models on the testing dataset (samples=1):

| Input             | VAE Output      | DVAE Output      |
|-------------------|-----------------|------------------|
| *(MNIST input)*   | *(MNIST VAE)*   | *(MNIST DVAE)*   |

CIFAR-10

Here are the graphs of the loss and KL divergence on the validation set over time during training of the VAE and DVAE models on the CIFAR-10 dataset (samples=1):

CIFAR-10 Loss

*(figure: CIFAR-10 loss)*

CIFAR-10 KL Divergence

*(figure: CIFAR-10 KLD)*

These are example inputs and outputs from the VAE and DVAE models on the CIFAR-10 testing dataset (samples=1):

| Input                | VAE Output         | DVAE Output         |
|----------------------|--------------------|---------------------|
| *(CIFAR-10 input)*   | *(CIFAR-10 VAE)*   | *(CIFAR-10 DVAE)*   |

Conclusion

Looking at the results, it is clear that there is contention between reducing the KLD and reducing the overall loss: the loss for the DVAE model is lower than the loss for the VAE model, while the KLD for the DVAE model is higher than for the VAE model.

The reason for this discrepancy is likely the unimodal nature of the Gaussian distribution used as the basis for determining the KLD. Since the distribution can only be centered around a single point, the only way to make the model recreate the input more accurately is to force the KLD to increase. Allowing additional clusters, by using a Gaussian mixture model as the basis for generating the prior, may instead permit a reduction in the model loss while also reducing the KLD (Dilokthanakul et al., 2017 [2]).

References

[1] Im, Daniel Jiwoong, et al. "Denoising criterion for variational auto-encoding framework." arXiv preprint arXiv:1511.06406 (2015).

[2] Dilokthanakul, Nat, et al. "Deep unsupervised clustering with Gaussian mixture variational autoencoders." arXiv preprint arXiv:1611.02648 (2016).
