carrier-of-tricks-for-classification-pytorch

Carrier of tricks for image classification tutorials using PyTorch. Based on "Bag of Tricks for Image Classification with Convolutional Neural Networks" (CVPR 2019), this repository implements a classification codebase on a custom dataset.

0. Experimental Setup (I used 1 GTX 1080 Ti GPU!)

0-1. Prepare Library

```
pip install -r requirements.txt
```

0-2. Download dataset (Kaggle Intel Image Classification)

This dataset contains around 25k images of size 150x150, distributed across 6 categories: {'buildings' -> 0, 'forest' -> 1, 'glacier' -> 2, 'mountain' -> 3, 'sea' -> 4, 'street' -> 5}.
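The repository ships its own data pipeline, but as a minimal sketch (assuming the Kaggle folder layout with one subdirectory per class; the `data/seg_train` path is hypothetical), torchvision's `ImageFolder` reproduces the same label mapping, since it sorts class folders alphabetically:

```python
from torchvision import datasets, transforms

# Minimal sketch, not the repository's own loader: ImageFolder assigns
# labels by sorting subdirectory names, which matches the mapping above.
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("data/seg_train", transform=transform)
print(train_set.class_to_idx)
# {'buildings': 0, 'forest': 1, 'glacier': 2, 'mountain': 3, 'sea': 4, 'street': 5}
```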

0-3. Download ImageNet-Pretrained Weights (EfficientNet, RegNet)

1. Baseline Training Setting

  • ImageNet Pretrained ResNet-50 from torchvision.models
  • 1080 Ti 1 GPU / Batch Size 64 / Epochs 120 / Initial Learning Rate 0.1
  • Training Augmentation: Resize((256, 256)), RandomHorizontalFlip()
  • SGD + Momentum(0.9) + learning rate step decay (×0.1 at epochs 30, 60, and 90; see the sketch below)
```
python main.py --checkpoint_name baseline;
```
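A minimal PyTorch sketch of this optimizer and schedule (illustrative; the actual setup lives in main.py):

```python
import torch
from torchvision import models

model = models.resnet50(pretrained=True)  # ImageNet-pretrained baseline
                                          # (newer torchvision uses weights=...)

# SGD with momentum 0.9; MultiStepLR multiplies the LR by 0.1
# at epochs 30, 60, and 90, as described above.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30, 60, 90], gamma=0.1)

for epoch in range(120):
    # ... one training epoch over the dataloader ...
    scheduler.step()
```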

1-1. Simple Trials

  • Randomly initialized ResNet-50 (trained from scratch)

```
python main.py --checkpoint_name baseline_scratch --pretrained 0;
```

  • Adam optimizer with a small learning rate (1e-4 works best!)

```
python main.py --checkpoint_name baseline_Adam --optimizer ADAM --learning_rate 0.0001
```

2. Bag of Tricks from Original Papers

Before starting, note that I did not try No bias decay, Low-precision Training, ResNet Model Tweaks, or Knowledge Distillation.

2-1. Learning Rate Warmup

  • The first 5 epochs are used for warmup.

```
python main.py --checkpoint_name baseline_warmup --decay_type step_warmup;
python main.py --checkpoint_name baseline_Adam_warmup --optimizer ADAM --learning_rate 0.0001 --decay_type step_warmup;
```
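A minimal sketch of linear warmup followed by step decay (illustrative, assuming warmup scales the LR linearly from near zero over the first 5 epochs; function and parameter names are my own):

```python
def warmup_step_lr(epoch, warmup_epochs=5, milestones=(30, 60, 90), gamma=0.1):
    """LR multiplier: linear warmup, then step decay (illustrative)."""
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    return gamma ** sum(epoch >= m for m in milestones)

# Usage with any optimizer:
# scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=warmup_step_lr)
```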

2-2. Zero gamma in Batch Normalization

  • Zero-initialize the gamma of the last BatchNorm layer in each residual branch

```
python main.py --checkpoint_name baseline_zerogamma --zero_gamma;
python main.py --checkpoint_name baseline_warmup_zerogamma --decay_type step_warmup --zero_gamma;
```
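A minimal sketch of this trick for torchvision's ResNet, where `bn3` is the last BatchNorm of each Bottleneck block (this mirrors the idea, not necessarily the repository's exact code):

```python
import torch.nn as nn
from torchvision.models.resnet import Bottleneck

def zero_init_last_bn(model):
    # With gamma = 0, each residual branch initially outputs zero, so every
    # block starts as an identity mapping, which eases early optimization.
    for m in model.modules():
        if isinstance(m, Bottleneck):
            nn.init.zeros_(m.bn3.weight)  # gamma of the block's last BN
```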

2-3. Cosine Learning Rate Annealing

```
python main.py --checkpoint_name baseline_Adam_warmup_cosine --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup;
```
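A minimal sketch of the cosine_warmup idea: linear warmup, then the LR follows half a cosine period down to zero (illustrative; the function name is my own):

```python
import math

def warmup_cosine_lr(epoch, warmup_epochs=5, total_epochs=120):
    """LR multiplier: linear warmup, then cosine decay to zero (illustrative)."""
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * (1.0 + math.cos(math.pi * progress))
```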

2-4. Label Smoothing

  • The paper uses a smoothing coefficient of 0.1; I use the same value.
  • The number of classes in ImageNet (1000) differs from the number of classes in our dataset (6), but I did not re-tune the coefficient.

```
python main.py --checkpoint_name baseline_Adam_warmup_cosine_labelsmooth --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name baseline_Adam_warmup_labelsmooth --optimizer ADAM --learning_rate 0.0001 --decay_type step_warmup --label_smooth 0.1;
```
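A minimal sketch of the smoothed cross-entropy loss (recent PyTorch also accepts `nn.CrossEntropyLoss(label_smoothing=0.1)`; the manual form below is illustrative):

```python
import torch.nn.functional as F

def label_smooth_ce(logits, target, eps=0.1):
    """Cross entropy against a smoothed target distribution (illustrative)."""
    log_probs = F.log_softmax(logits, dim=-1)
    # Smoothed targets put 1 - eps on the true class and eps/K uniformly.
    nll = -log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1)
    smooth = -log_probs.mean(dim=-1)
    return ((1.0 - eps) * nll + eps * smooth).mean()
```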

2-5. MixUp Augmentation

  • MixUp paper link
  • lambda is a random number drawn from a Beta(alpha, alpha) distribution.
  • I use alpha = 0.2, as in the paper.

```
python main.py --checkpoint_name baseline_Adam_warmup_mixup --optimizer ADAM --learning_rate 0.0001 --decay_type step_warmup --mixup 0.2;
python main.py --checkpoint_name baseline_Adam_warmup_cosine_mixup --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --mixup 0.2;
python main.py --checkpoint_name baseline_Adam_warmup_labelsmooth_mixup --optimizer ADAM --learning_rate 0.0001 --decay_type step_warmup --label_smooth 0.1 --mixup 0.2;
python main.py --checkpoint_name baseline_Adam_warmup_cosine_labelsmooth_mixup --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1 --mixup 0.2;
```
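A minimal sketch of MixUp on a batch (illustrative; the loss is then the same lambda-weighted mix of the two targets' losses):

```python
import numpy as np
import torch

def mixup_batch(x, y, alpha=0.2):
    """Convexly combine a batch with a shuffled copy of itself (illustrative)."""
    lam = np.random.beta(alpha, alpha)
    index = torch.randperm(x.size(0), device=x.device)
    mixed_x = lam * x + (1.0 - lam) * x[index]
    return mixed_x, y, y[index], lam

# loss = lam * criterion(output, y_a) + (1 - lam) * criterion(output, y_b)
```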

3. Additional Tricks from hoya012's survey note

3-1. CutMix Augmentation

  • CutMix paper link
  • I use the same hyper-parameters (cutmix alpha = 1.0, cutmix prob = 1.0) as the paper's ImageNet experimental setting.

```
python main.py --checkpoint_name baseline_Adam_warmup_cosine_cutmix --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
```
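A minimal sketch of CutMix (illustrative): cut a random box whose area ratio is 1 - lambda, paste the corresponding patch from a shuffled batch, then re-derive lambda from the actual pasted area.

```python
import numpy as np
import torch

def cutmix_batch(x, y, alpha=1.0):
    """Paste a random box from a shuffled batch into x (illustrative)."""
    lam = np.random.beta(alpha, alpha)
    index = torch.randperm(x.size(0), device=x.device)
    h, w = x.shape[2], x.shape[3]
    # Box with area ratio (1 - lam), centered at a random point.
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    x[:, :, y1:y2, x1:x2] = x[index, :, y1:y2, x1:x2]
    # Re-derive lam from the box that was actually pasted.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    return x, y, y[index], lam
```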

3-2. RAdam Optimizer

```
python main.py --checkpoint_name baseline_RAdam_warmup_cosine_labelsmooth --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name baseline_RAdam_warmup_cosine_cutmix --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
```
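RAdam rectifies the high variance of Adam's adaptive learning rate early in training. A usage sketch (recent PyTorch, >= 1.10, ships torch.optim.RAdam; older codebases, likely including this one, vendor the authors' implementation):

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 6)  # placeholder model for illustration
# Drop-in replacement for Adam; same lr as the Adam runs above.
optimizer = torch.optim.RAdam(model.parameters(), lr=1e-4)
```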

3-3. RandAugment

```
python main.py --checkpoint_name baseline_Adam_warmup_cosine_labelsmooth_randaug --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --randaugment;
```
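RandAugment applies N randomly chosen augmentation ops at magnitude M to each image. A sketch using torchvision's built-in version (available in torchvision >= 0.11; the repository may bundle its own, and N=2, M=9 here are common defaults, not necessarily its settings):

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomHorizontalFlip(),
    transforms.RandAugment(num_ops=2, magnitude=9),  # applied on PIL images
    transforms.ToTensor(),
])
```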

3-4. EvoNorm

```
python main.py --checkpoint_name baseline_Adam_warmup_cosine_labelsmmoth_evonorm --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1 --norm evonorm;
```
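EvoNorm replaces the BatchNorm + ReLU pair with a single searched normalization-activation op. A minimal sketch of the batch-independent EvoNorm-S0 variant (illustrative, assuming 4D inputs; not necessarily the repository's implementation):

```python
import torch
import torch.nn as nn

class EvoNormS0(nn.Module):
    """EvoNorm-S0: y = x * sigmoid(v * x) / group_std(x) * gamma + beta (sketch)."""
    def __init__(self, channels, groups=8, eps=1e-5):
        super().__init__()
        self.groups, self.eps = groups, eps
        self.gamma = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.v = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x):
        n, c, h, w = x.shape
        g = x.view(n, self.groups, c // self.groups, h, w)
        std = (g.var(dim=(2, 3, 4), keepdim=True) + self.eps).sqrt()
        std = std.expand_as(g).reshape(n, c, h, w)
        return x * torch.sigmoid(self.v * x) / std * self.gamma + self.beta
```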

3-5. Other Architecture (EfficientNet, RegNet)

  • I use EfficientNet-B2, which performs comparably to ResNet-50.
    • However, due to GPU memory limits, I use a smaller batch size (48)...
  • I use RegNetY-1.6GF, which has FLOPS and performance comparable to ResNet-50.

```
python main.py --checkpoint_name efficientnet_Adam_warmup_cosine_labelsmooth --model EfficientNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name efficientnet_Adam_warmup_cosine_labelsmooth_mixup --model EfficientNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1 --mixup 0.2;
python main.py --checkpoint_name efficientnet_Adam_warmup_cosine_cutmix --model EfficientNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
python main.py --checkpoint_name efficientnet_RAdam_warmup_cosine_labelsmooth --model EfficientNet --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name efficientnet_RAdam_warmup_cosine_cutmix --model EfficientNet --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
```

```
python main.py --checkpoint_name regnet_Adam_warmup_cosine_labelsmooth --model RegNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name regnet_Adam_warmup_cosine_labelsmooth_mixup --model RegNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1 --mixup 0.2;
python main.py --checkpoint_name regnet_Adam_warmup_cosine_cutmix --model RegNet --optimizer ADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
python main.py --checkpoint_name regnet_RAdam_warmup_cosine_labelsmooth --model RegNet --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --label_smooth 0.1;
python main.py --checkpoint_name regnet_RAdam_warmup_cosine_cutmix --model RegNet --optimizer RADAM --learning_rate 0.0001 --decay_type cosine_warmup --cutmix_alpha 1.0 --cutmix_prob 1.0;
```
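Step 0-3 above downloads pretrained weights separately, so the repository presumably loads both architectures from dedicated packages or checkpoints. As a rough sketch only, recent torchvision (>= 0.11) also provides both models directly:

```python
from torchvision import models

# Illustrative only; the repository loads its own downloaded checkpoints.
# Newer torchvision versions use the weights= argument instead of pretrained=.
effnet = models.efficientnet_b2(pretrained=True)
regnet = models.regnet_y_1_6gf(pretrained=True)
```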

4. Performance Table

  • B : Baseline
  • A : Adam Optimizer
  • W : Warmup
  • C : Cosine Annealing
  • S : Label Smoothing
  • M : MixUp Augmentation
  • CM : CutMix Augmentation
  • R : RAdam Optimizer
  • RA : RandAugment
  • E : EvoNorm
  • EN : EfficientNet
  • RN : RegNet

| Algorithm | Validation Accuracy (%) | Test Accuracy (%) |
|:----------|:-----------------------:|:-----------------:|
| B from scratch | 86.68 | 86.10 |
| B | 86.14 | 87.93 |
| B + A | 93.34 | 93.90 |
| B + A + W | 93.77 | 94.17 |
| B + A + W + C | 93.66 | 93.67 |
| B + A + W + S | 93.94 | 93.77 |
| B + A + W + C + S | 93.80 | 93.63 |
| B + A + W + M | 94.09 | 94.20 |
| B + A + W + S + M | 93.69 | 94.40 |
| B + A + W + C + S + M | 93.77 | 93.77 |
| BAWC + CM | 94.44 | 93.97 |
| BWCS + R | 93.27 | 93.73 |
| BAWCS + RA | 93.94 | 93.80 |
| BAWCS + E | 93.55 | 93.70 |
| BWC + CM + R | 94.23 | 93.90 |
| EN + AWCSM | 93.48 | 93.50 |
| EN + AWC + CM | 94.19 | 94.03 |
| EN + WCS + R | 93.91 | 94.03 |
| EN + WC + CM + R | 93.98 | 94.27 |
| RN + AWCSM | 94.30 | 94.30 |
| RN + AWC + CM | 93.91 | 94.97 |
| RN + WCS + R | 93.91 | 94.10 |
| RN + WC + CM + R | 94.48 | 94.37 |

5. How to run all of the experiments?

6. Code Reference
