[CVPR 2020 Workshop] A PyTorch GAN library that reproduces research results for popular GANs.
About | Documentation | Tutorial | Gallery | Paper
Mimicry is a lightweight PyTorch library aimed towards the reproducibility of GAN research.
Comparing GANs is often difficult - mild differences in implementations and evaluation methodologies can result in huge performance differences. Mimicry aims to resolve this by providing: (a) Standardized implementations of popular GANs that closely reproduce reported scores; (b) Baseline scores of GANs trained and evaluated under the same conditions; (c) A framework for researchers to focus on implementation of GANs without rewriting most of GAN training boilerplate code, with support for multiple GAN evaluation metrics.
We provide a model zoo and set of baselines to benchmark different GANs of the same model size trained under the same conditions, using multiple metrics. To ensure reproducibility, we verify scores of our implemented models against reported scores in literature.
The library can be installed with:
pip install git+https://github.com/kwotsin/mimicry.git
See also setup information for more.
Training a popular GAN like SNGAN that reproduces reported scores can be done as simply as:

    import torch
    import torch.optim as optim
    import torch_mimicry as mmc
    from torch_mimicry.nets import sngan

    # Data handling objects
    device = torch.device('cuda:0' if torch.cuda.is_available() else "cpu")
    dataset = mmc.datasets.load_dataset(root='./datasets', name='cifar10')
    dataloader = torch.utils.data.DataLoader(
        dataset, batch_size=64, shuffle=True, num_workers=4)

    # Define models and optimizers
    netG = sngan.SNGANGenerator32().to(device)
    netD = sngan.SNGANDiscriminator32().to(device)
    optD = optim.Adam(netD.parameters(), 2e-4, betas=(0.0, 0.9))
    optG = optim.Adam(netG.parameters(), 2e-4, betas=(0.0, 0.9))

    # Start training
    trainer = mmc.training.Trainer(
        netD=netD,
        netG=netG,
        optD=optD,
        optG=optG,
        n_dis=5,
        num_steps=100000,
        lr_decay='linear',
        dataloader=dataloader,
        log_dir='./log/example',
        device=device)
    trainer.train()

    # Evaluate fid
    mmc.metrics.evaluate(
        metric='fid',
        log_dir='./log/example',
        netG=netG,
        dataset='cifar10',
        num_real_samples=50000,
        num_fake_samples=50000,
        evaluate_step=100000,
        device=device)

Example outputs:

    >>> INFO: [Epoch 1/127][Global Step: 10/100000]
    | D(G(z)): 0.5941
    | D(x): 0.9303
    | errD: 1.4052
    | errG: -0.6671
    | lr_D: 0.0002
    | lr_G: 0.0002
    | (0.4550 sec/idx)
    ^CINFO: Saving checkpoints from keyboard interrupt...
    INFO: Training Ended

Tensorboard visualizations:
tensorboard --logdir=./log/example
See further details in example script, as well as a detailed tutorial on implementing a custom GAN from scratch.
- Evaluating a pre-trained generator model
- Evaluation using custom datasets
- Implementing, training and evaluating a custom GAN
For a fair comparison, we train all models under the same training conditions for each dataset, each implemented using ResNet backbones of the same architectural capacity. We train our models with the Adam optimizer using the popular hyperparameters (β1, β2) = (0.0, 0.9). ndis represents the number of discriminator update steps per generator update step, and niter is simply the number of training iterations.
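As a rough illustration of the schedule described above, the sketch below computes a linearly decayed learning rate and counts generator updates for a run with `n_dis=5` and `num_steps=100000`. The helper names are hypothetical and not part of the library. It assumes that the linear policy decays the rate to zero at the final step and that the generator is updated on every `n_dis`-th training step; the library's actual behaviour may differ in detail.

```python
def linear_decay_lr(base_lr, step, num_steps):
    """Learning rate at `step`, assuming a linear decay from base_lr to 0."""
    if not 0 <= step <= num_steps:
        raise ValueError("step must lie in [0, num_steps]")
    return base_lr * (1.0 - step / num_steps)


def count_generator_updates(num_steps, n_dis):
    """Count generator updates, assuming one update every n_dis-th step."""
    return sum(1 for step in range(1, num_steps + 1) if step % n_dis == 0)


# Values taken from the hyperparameter tables: lr = 2e-4, n_dis = 5, 100K steps.
lr_start = linear_decay_lr(2e-4, 0, 100_000)
lr_mid = linear_decay_lr(2e-4, 50_000, 100_000)
lr_end = linear_decay_lr(2e-4, 100_000, 100_000)
g_updates = count_generator_updates(100_000, 5)

print(lr_start, lr_mid, lr_end, g_updates)
```

Under these assumptions the generator sees one fifth as many updates as the discriminator, which is worth keeping in mind when comparing runs with different ndis values.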
Abbrev. | Name | Type* |
---|---|---|
DCGAN | Deep Convolutional GAN | Unconditional |
WGAN-GP | Wasserstein GAN with Gradient Penalty | Unconditional |
SNGAN | Spectral Normalization GAN | Unconditional |
cGAN-PD | Conditional GAN with Projection Discriminator | Conditional |
SSGAN | Self-supervised GAN | Unconditional |
InfoMax-GAN | Infomax-GAN | Unconditional |
*Conditional GAN scores are only reported for labelled datasets.
Metric | Method |
---|---|
Inception Score (IS)* | 50K samples at 10 splits |
Fréchet Inception Distance (FID) | 50K real/generated samples |
Kernel Inception Distance (KID) | 50K real/generated samples, averaged over 10 splits. |
*Inception Score can be a poor indicator of GAN performance, as it does not measure diversity and is not domain agnostic. This is why certain datasets with only a single class (e.g. CelebA and LSUN-Bedroom) will perform poorly when using this metric.
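To make the footnote concrete, here is a minimal sketch of the Inception Score formula, IS = exp(E_x[KL(p(y|x) || p(y))]), evaluated over splits. It is not the library's implementation: a real evaluation takes p(y|x) from a pre-trained Inception network, whereas this sketch uses hand-made one-hot predictions, and the function names are hypothetical.

```python
import math


def inception_score(probs):
    """exp of the mean KL divergence between p(y|x) and the marginal p(y)."""
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[c] for p in probs) / n for c in range(num_classes)]
    kl_sum = 0.0
    for p in probs:
        for c in range(num_classes):
            if p[c] > 0.0:  # terms with p(y|x) = 0 contribute nothing
                kl_sum += p[c] * math.log(p[c] / marginal[c])
    return math.exp(kl_sum / n)


def split_scores(probs, num_splits=10):
    """Mean and standard deviation of the score over equal-sized splits."""
    size = len(probs) // num_splits
    scores = [inception_score(probs[i * size:(i + 1) * size])
              for i in range(num_splits)]
    mean = sum(scores) / num_splits
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / num_splits)
    return mean, std


def one_hot(index, num_classes):
    return [1.0 if c == index else 0.0 for c in range(num_classes)]


# Confident predictions spread evenly over 10 classes.
diverse = [one_hot(i % 10, 10) for i in range(1000)]
# Every image assigned to the same class, as on a single-class dataset.
single_class = [one_hot(0, 10) for _ in range(1000)]

is_diverse = inception_score(diverse)
is_single = inception_score(single_class)
mean_diverse, std_diverse = split_scores(diverse)
```

In this toy setting the evenly spread predictions reach the maximum score for 10 classes, while the single-class predictions collapse to the minimum of 1 regardless of how good the images are, which is the behaviour the footnote warns about.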
Dataset | Split | Resolution |
---|---|---|
CIFAR-10 | Train | 32 x 32 |
CIFAR-100 | Train | 32 x 32 |
ImageNet | Train | 32 x 32 |
STL-10 | Unlabeled | 48 x 48 |
CelebA | All | 64 x 64 |
CelebA | All | 128 x 128 |
LSUN-Bedroom | Train | 128 x 128 |
ImageNet | Train | 128 x 128 |
Resolution | Batch Size | Learning Rate | β1 | β2 | Decay Policy | ndis | niter |
---|---|---|---|---|---|---|---|
128 x 128 | 64 | 2e-4 | 0.0 | 0.9 | None | 2 | 100K |
64 x 64 | 64 | 2e-4 | 0.0 | 0.9 | Linear | 5 | 100K |
Resolution | Model | IS | FID | KID | Checkpoint | Code |
---|---|---|---|---|---|---|
128 x 128 | SNGAN | 2.72 ± 0.01 | 12.93 ± 0.04 | 0.0076 ± 0.0001 | netG.pth | sngan_128.py |
128 x 128 | SSGAN | 2.63 ± 0.01 | 15.18 ± 0.10 | 0.0101 ± 0.0001 | netG.pth | ssgan_128.py |
128 x 128 | InfoMax-GAN | 2.84 ± 0.01 | 9.50 ± 0.04 | 0.0063 ± 0.0001 | netG.pth | infomax_gan_128.py |
64 x 64 | SNGAN | 2.68 ± 0.01 | 5.71 ± 0.02 | 0.0033 ± 0.0001 | netG.pth | sngan_64.py |
64 x 64 | SSGAN | 2.67 ± 0.01 | 6.03 ± 0.04 | 0.0036 ± 0.0001 | netG.pth | ssgan_64.py |
64 x 64 | InfoMax-GAN | 2.68 ± 0.01 | 5.71 ± 0.06 | 0.0033 ± 0.0001 | netG.pth | infomax_gan_64.py |
Resolution | Batch Size | Learning Rate | β1 | β2 | Decay Policy | ndis | niter |
---|---|---|---|---|---|---|---|
128 x 128 | 64 | 2e-4 | 0.0 | 0.9 | Linear | 2 | 100K |
Resolution | Model | IS | FID | KID | Checkpoint | Code |
---|---|---|---|---|---|---|
128 x 128 | SNGAN | 2.30 ± 0.01 | 25.87 ± 0.03 | 0.0141 ± 0.0001 | netG.pth | sngan_128.py |
128 x 128 | SSGAN | 2.12 ± 0.01 | 12.02 ± 0.07 | 0.0077 ± 0.0001 | netG.pth | ssgan_128.py |
128 x 128 | InfoMax-GAN | 2.22 ± 0.01 | 12.13 ± 0.16 | 0.0080 ± 0.0001 | netG.pth | infomax_gan_128.py |
Resolution | Batch Size | Learning Rate | β1 | β2 | Decay Policy | ndis | niter |
---|---|---|---|---|---|---|---|
48 x 48 | 64 | 2e-4 | 0.0 | 0.9 | Linear | 5 | 100K |
Resolution | Model | IS | FID | KID | Checkpoint | Code |
---|---|---|---|---|---|---|
48 x 48 | WGAN-GP | 8.55 ± 0.02 | 43.01 ± 0.19 | 0.0440 ± 0.0003 | netG.pth | wgan_gp_48.py |
48 x 48 | SNGAN | 8.04 ± 0.07 | 39.56 ± 0.10 | 0.0369 ± 0.0002 | netG.pth | sngan_48.py |
48 x 48 | SSGAN | 8.25 ± 0.06 | 37.06 ± 0.19 | 0.0332 ± 0.0004 | netG.pth | ssgan_48.py |
48 x 48 | InfoMax-GAN | 8.54 ± 0.12 | 35.52 ± 0.10 | 0.0326 ± 0.0002 | netG.pth | infomax_gan_48.py |
Resolution | Batch Size | Learning Rate | β1 | β2 | Decay Policy | ndis | niter |
---|---|---|---|---|---|---|---|
32 x 32 | 64 | 2e-4 | 0.0 | 0.9 | Linear | 5 | 100K |
128 x 128 | 64 | 2e-4 | 0.0 | 0.9 | None | 5 | 450K |
Resolution | Model | IS | FID | KID | Checkpoint | Code |
---|---|---|---|---|---|---|
128 x 128 | SNGAN | 13.05 ± 0.05 | 65.74 ± 0.31 | 0.0663 ± 0.0004 | netG.pth | sngan_128.py |
128 x 128 | SSGAN | 13.30 ± 0.03 | 62.48 ± 0.31 | 0.0616 ± 0.0004 | netG.pth | ssgan_128.py |
128 x 128 | InfoMax-GAN | 13.68 ± 0.06 | 58.91 ± 0.14 | 0.0579 ± 0.0004 | netG.pth | infomax_gan_128.py |
32 x 32 | SNGAN | 8.97 ± 0.12 | 23.04 ± 0.06 | 0.0157 ± 0.0002 | netG.pth | sngan_32.py |
32 x 32 | cGAN-PD | 9.08 ± 0.17 | 21.17 ± 0.05 | 0.0145 ± 0.0002 | netG.pth | cgan_pd_32.py |
32 x 32 | SSGAN | 9.11 ± 0.12 | 21.79 ± 0.09 | 0.0152 ± 0.0002 | netG.pth | ssgan_32.py |
32 x 32 | InfoMax-GAN | 9.04 ± 0.10 | 20.68 ± 0.02 | 0.0149 ± 0.0001 | netG.pth | infomax_gan_32.py |
Resolution | Batch Size | Learning Rate | β1 | β2 | Decay Policy | ndis | niter |
---|---|---|---|---|---|---|---|
32 x 32 | 64 | 2e-4 | 0.0 | 0.9 | Linear | 5 | 100K |
Resolution | Model | IS | FID | KID | Checkpoint | Code |
---|---|---|---|---|---|---|
32 x 32 | WGAN-GP | 7.33 ± 0.02 | 22.29 ± 0.06 | 0.0204 ± 0.0004 | netG.pth | wgan_gp_32.py |
32 x 32 | SNGAN | 7.97 ± 0.06 | 16.77 ± 0.04 | 0.0125 ± 0.0001 | netG.pth | sngan_32.py |
32 x 32 | cGAN-PD | 8.25 ± 0.13 | 10.84 ± 0.03 | 0.0070 ± 0.0001 | netG.pth | cgan_pd_32.py |
32 x 32 | SSGAN | 8.17 ± 0.06 | 14.65 ± 0.04 | 0.0101 ± 0.0002 | netG.pth | ssgan_32.py |
32 x 32 | InfoMax-GAN | 8.08 ± 0.08 | 15.12 ± 0.10 | 0.0112 ± 0.0001 | netG.pth | infomax_gan_32.py |
Resolution | Batch Size | Learning Rate | β1 | β2 | Decay Policy | ndis | niter |
---|---|---|---|---|---|---|---|
32 x 32 | 64 | 2e-4 | 0.0 | 0.9 | Linear | 5 | 100K |
Resolution | Model | IS | FID | KID | Checkpoint | Code |
---|---|---|---|---|---|---|
32 x 32 | SNGAN | 7.57 ± 0.11 | 22.61 ± 0.06 | 0.0156 ± 0.0003 | netG.pth | sngan_32.py |
32 x 32 | cGAN-PD | 8.92 ± 0.07 | 14.16 ± 0.01 | 0.0085 ± 0.0002 | netG.pth | cgan_pd_32.py |
32 x 32 | SSGAN | 7.56 ± 0.07 | 22.18 ± 0.10 | 0.0161 ± 0.0002 | netG.pth | ssgan_32.py |
32 x 32 | InfoMax-GAN | 7.86 ± 0.10 | 18.94 ± 0.13 | 0.0135 ± 0.0004 | netG.pth | infomax_gan_32.py |
To verify our implementations, we reproduce reported scores in literature by re-implementing the models with the same architecture, training them under the same conditions and evaluating them on CIFAR-10 using the exact same methodology for computing FID.
As FID produces highly biased estimates (where using larger samples leads to a lower score), we reproduce the scores using the same sample sizes, where nreal and nfake refer to the number of real and fake images, respectively, used for computing FID.
Metric | Model | Score | Reported Score | nreal | nfake | Checkpoint | Code |
---|---|---|---|---|---|---|---|
FID | DCGAN | 28.95 ± 0.42 | 28.12 [4] | 10K | 10K | netG.pth | dcgan_cifar.py |
FID | WGAN-GP | 26.08 ± 0.12 | 29.3† [6] | 50K | 50K | netG.pth | wgan_gp_32.py |
FID | SNGAN | 23.90 ± 0.20 | 21.7 ± 0.21 [1] | 10K | 5K | netG.pth | sngan_32.py |
FID | cGAN-PD | 17.84 ± 0.17 | 17.5 [2] | 10K | 5K | netG.pth | cgan_pd_32.py |
FID | SSGAN | 17.61 ± 0.14 | 17.88 ± 0.64 [3] | 10K | 10K | netG.pth | ssgan_32.py |
FID | InfoMax-GAN | 17.14 ± 0.20 | 17.14 ± 0.20 [5] | 50K | 10K | netG.pth | infomax_gan_32.py |
† Best FID was reported at 53K steps, but we find our score continues to improve up to 100K steps, reaching 23.13 ± 0.13.
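The sample-size bias mentioned above can be illustrated with a simplified one-dimensional sketch. This is not how the library computes FID: the real metric fits multivariate Gaussians to Inception features, while this toy version fits 1-D Gaussians to random numbers, where the Fréchet distance reduces to (μ_r − μ_f)² + (σ_r − σ_f)². The function names are hypothetical.

```python
import random
import statistics


def fid_1d(real, fake):
    """Fréchet distance between 1-D Gaussians fitted to two samples."""
    mu_r, mu_f = statistics.fmean(real), statistics.fmean(fake)
    sd_r, sd_f = statistics.stdev(real), statistics.stdev(fake)
    return (mu_r - mu_f) ** 2 + (sd_r - sd_f) ** 2


def mean_fid(sample_size, trials=200, seed=0):
    """Average estimate when both samples come from the same N(0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        real = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        fake = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        total += fid_1d(real, fake)
    return total / trials


# The true distance is 0 in both cases; only the sample size differs.
fid_small = mean_fid(10)
fid_large = mean_fid(1000)
print(fid_small, fid_large)
```

Even though both samples are drawn from the same distribution, the estimate stays above zero and shrinks as the sample size grows, so scores computed with different nreal and nfake are not directly comparable.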
If you have found this work useful, please consider citing our work:

    @article{lee2020mimicry,
        title={Mimicry: Towards the Reproducibility of GAN Research},
        author={Kwot Sin Lee and Christopher Town},
        booktitle={CVPR Workshop on AI for Content Creation},
        year={2020},
    }

For citing InfoMax-GAN:

    @InProceedings{Lee_2021_WACV,
        author = {Lee, Kwot Sin and Tran, Ngoc-Trung and Cheung, Ngai-Man},
        title = {InfoMax-GAN: Improved Adversarial Image Generation via Information Maximization and Contrastive Learning},
        booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
        month = {January},
        year = {2021},
        pages = {3942-3952}
    }

[1] Spectral Normalization for Generative Adversarial Networks
[2] cGANs with Projection Discriminator
[3] Self-Supervised GANs via Auxiliary Rotation Loss
[4] A Large-Scale Study on Regularization and Normalization in GANs
[5] InfoMax-GAN: Improved Adversarial Image Generation via Information Maximization and Contrastive Learning
[6] GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium