kui-jia/svb

Code release for the CVPR2017 paper "Improving training of deep neural networks via Singular Value Bounding", achieving error rates of 3.06% on CIFAR-10 and 16.90% on CIFAR-100.

This is the code release for the Singular Value Bounding (SVB) and Bounded Batch Normalization (BBN) methods proposed in the CVPR2017 paper "Improving training of deep neural networks via Singular Value Bounding", authored by Kui Jia, Dacheng Tao, Shenghua Gao, and Xiangmin Xu.

This work investigates solution properties of neural networks that can potentially lead to good performance. Inspired by orthogonal weight initialization, we propose to constrain the solutions of weight matrices in the orthogonal feasible set during the whole process of network training.

We achieve this by a simple yet effective method called Singular Value Bounding (SVB). In SVB, all singular values of each weight matrix are simply bounded in a narrow band around the value of 1. Based on the same motivation, we also propose Bounded Batch Normalization (BBN), which improves Batch Normalization by removing its potential risk of ill-conditioned layer transforms.
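
For intuition, the core SVB step can be sketched in a few lines of Torch/Lua: every few SGD iterations, each weight matrix is decomposed by SVD and its singular values are clamped into a narrow band around 1. The snippet below is a minimal illustration, not the repository's implementation (that lives in cnnTrain.lua); the helper name boundSingularValues and the band [1/(1+\epsilon), 1+\epsilon] are assumptions here, so please check Algorithm 1 of the paper for the precise form.

require 'torch'

-- Minimal sketch of one SVB projection step (cf. Algorithm 1 of the paper).
-- Assumption: the layer's weights have already been reshaped into a 2D matrix W.
local function boundSingularValues(W, eps)
  local U, S, V = torch.svd(W)       -- W = U * diag(S) * V^T
  S:clamp(1 / (1 + eps), 1 + eps)    -- clamp all singular values into a narrow band around 1
  return U * torch.diag(S) * V:t()   -- re-assemble the bounded weight matrix
end

-- Example: bound a random 64 x 128 weight matrix with eps = 0.05.
local W = torch.randn(64, 128)
W:copy(boundSingularValues(W, 0.05))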

We present both theoretical and empirical results to justify our proposed methods. In particular, we achieve state-of-the-art results of 3.06% error rate on CIFAR10 and 16.90% on CIFAR100, using off-the-shelf network architectures (Wide ResNets).

Results

Controlled studies on CIFAR10 using 20-layer (left) and 38-layer (right) ConvNets (VGG)


Validation curves on CIFAR10 using two ConvNets of 20 and 38 weight layers respectively. Blue lines are results by SGD with momentum. Red lines are results by SVB at different values of \epsilon (0.01, 0.05, 0.2, 0.5, 1) in Algorithm 1 of the paper. Black lines are results using both SVB (fixing \epsilon = 0.05) and BBN at different values of \tilde{\epsilon} (0.01, 0.05, 0.2, 0.5, 1) in Algorithm 2 of the paper. The left two figures are from the 20-layer ConvNet, and the right two from the 38-layer ConvNet.

Ablation studies on CIFAR10 using a 68-layer ResNet

Training methods | Error rate (%)
SGD with momentum + BN | 6.10 (6.22 +/- 0.14)
SVB + BN | 5.65 (5.79 +/- 0.10)
SVB + BBN | 5.37 (5.49 +/- 0.11)

Ablation studies on CIFAR10, using a pre-activation ResNet with 68 weight layers of 3 x 3 convolutional filters. Results are in the format of best (mean +/- std) over 5 runs. Standard data augmentation (4-pixel zero-padding plus horizontal flipping) is used.

Results on CIFAR10 and CIFAR100 using Wide ResNets

Methods | CIFAR10 error (%) | CIFAR100 error (%) | # layers | # params
Wide ResNet W/O SVB+BBN | 3.78 | 19.92 | 28 | 36.5M
Wide ResNet WITH SVB+BBN | 3.24 | 17.47 | 28 | 36.5M
Wider ResNet W/O SVB+BBN | 3.64 | 19.25 | 28 | 94.2M
Wider ResNet WITH SVB+BBN | 3.06 | 16.90 | 28 | 94.2M

Wide ResNet and Wider ResNet in the table above respectively refer to the architectures "WRN-28-10" and "WRN-28-16" as in Wide Residual Networks. Standard data augmentation (4-pixel zero-padding plus horizontal flipping) is used.

Preliminary results on ImageNet

Training methods | Top-1 error (%) | Top-5 error (%)
Our Inception-ResNet | 21.61 | 5.91
Our Inception-ResNet WITH SVB+BN | 21.20 | 5.57

Results of single-model (Inception-ResNet) and single-crop testing on the ImageNet validation set.

Usage

Installation

The code depends on the Torch library. Please install Torch first.
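
If Torch is not yet installed, the usual route is the official Torch7 distro scripts; the commands below are a sketch of that standard procedure, so please verify them against the instructions at torch.ch and adjust the install path as needed.

git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch
bash install-deps
./install.sh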

Support of datasets

CIFAR10

CIFAR100 (coming soon)

ImageNet (coming soon)

One may refer to the fb.resnet.torch package for how to obtain and pre-process these datasets. The dataset to use is specified in the file optsArgParse.lua.

Support of network architectures

ResNet (pre-activation)

Wide ResNets

Inception-ResNet (coming soon)

DenseNet (coming soon)

ResNeXt (coming soon)

Training

We take a pre-activation version of ResNet as an example to explain how to train a deep network using the SVB and BBN methods.

Run the following at the command line when SVB and BBN are not used (i.e., training is based on standard SGD with momentum):

th main.lua -cudnnSetting deterministic -netType PreActResNet -ensembleID 1 -BN true -nBaseRecur 11 -kWRN 1 -lrDecayMethod exp -lrBase 0.5 -lrEnd 0.001 -batchSize 128 -nEpoch 160 -nLRDecayStage 80 -weightDecay 0.0001 -svBFlag false -bnsBFlag false

Run the following at the command line when SVB is turned on:

th main.lua -cudnnSetting deterministic -netType PreActResNet -ensembleID 1 -BN true -nBaseRecur 11 -kWRN 1 -lrDecayMethod exp -lrBase 0.5 -lrEnd 0.001 -batchSize 128 -nEpoch 160 -nLRDecayStage 80 -weightDecay 0.0001 -svBFlag true -svBFactor 1.5 -svBIter 391 -bnsBFlag false

Run the following at the command line when both SVB and BBN are turned on:

th main.lua -cudnnSetting deterministic -netType PreActResNet -ensembleID 1 -BN true -nBaseRecur 11 -kWRN 1 -lrDecayMethod exp -lrBase 0.5 -lrEnd 0.001 -batchSize 128 -nEpoch 160 -nLRDecayStage 80 -weightDecay 0.0001 -svBFlag true -svBFactor 1.5 -svBIter 391 -bnsBFlag true -bnsBFactor 2 -bnsBType BBN

Set svBFactor such that svBFactor = 1 + \epsilon (the bound in Algorithm 1), and bnsBFactor such that bnsBFactor = 1 + \tilde{\epsilon} (the bound in Algorithm 2); for example, the commands above use -svBFactor 1.5 (\epsilon = 0.5) and -bnsBFactor 2 (\tilde{\epsilon} = 1). Setting kWRN > 1 turns the architecture into a Wide ResNet. Please refer to the file optsArgParse.lua for the other hyperparameters.

One may also set bnsBType as rel to get even better performance.

Use of SVB and BBN in your own code

Implementation of the SVB and BBN methods is in the file cnnTrain.lua, via the functions cnnTrain:fcConvWeightReguViaSVB() and cnnTrain:BNScalingRegu() respectively. One may refer to cnnTrain.lua and main.lua for use of these two functions, and to the sketch below for the general call pattern.
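
A rough sketch of where these calls fit in a training loop is given below; the loop structure, trainOneMiniBatch, and the opt field names are illustrative placeholders inferred from the command-line flags above, not the repository's actual control flow (see main.lua for that).

-- Hypothetical skeleton: apply SVB (and optionally BBN) every svBIter mini-batches.
for iter = 1, nIterations do
  trainOneMiniBatch()                    -- standard SGD-with-momentum update (placeholder)
  if opt.svBFlag and iter % opt.svBIter == 0 then
    cnnTrain:fcConvWeightReguViaSVB()    -- SVB: bound singular values of the weight matrices
    if opt.bnsBFlag then
      cnnTrain:BNScalingRegu()           -- BBN: bound the Batch Normalization scaling
    end
  end
end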

Contact

kuijia At scut.edu.cn
