kui-jia/svb

Code release for the CVPR2017 paper "Improving training of deep neural networks via Singular Value Bounding", achieving error rates of 3.06% on CIFAR-10 and 16.90% on CIFAR-100.

This is the code release for the Singular Value Bounding (SVB) and Bounded Batch Normalization (BBN) methods proposed in the CVPR2017 paper "Improving training of deep neural networks via Singular Value Bounding", authored by Kui Jia, Dacheng Tao, Shenghua Gao, and Xiangmin Xu.

This work investigates solution properties of neural networks that can potentially lead to good performance. Inspired by orthogonal weight initialization, we propose to constrain the solutions of weight matrices in the orthogonal feasible set during the whole process of network training.

We achieve this by a simple yet effective method called Singular Value Bounding (SVB). In SVB, all singular values of each weight matrix are simply bounded in a narrow band around the value of 1. Based on the same motivation, we also propose Bounded Batch Normalization (BBN), which improves Batch Normalization by removing its potential risk of ill-conditioned layer transforms.
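
For intuition, the core SVB step can be sketched in a few lines of Torch/Lua: every few SGD iterations, each weight matrix is decomposed by SVD and its singular values are clamped into a narrow band around 1. The snippet below is a minimal illustration, not the repository's implementation (that lives in cnnTrain.lua); the helper name boundSingularValues and the band [1/(1+\epsilon), 1+\epsilon] are assumptions here, so please check Algorithm 1 of the paper for the precise form.

require 'torch'

-- Minimal sketch of one SVB projection step (cf. Algorithm 1 of the paper).
-- Assumption: the layer's weights have already been reshaped into a 2D matrix W.
local function boundSingularValues(W, eps)
  local U, S, V = torch.svd(W)       -- W = U * diag(S) * V^T
  S:clamp(1 / (1 + eps), 1 + eps)    -- clamp all singular values into a narrow band around 1
  return U * torch.diag(S) * V:t()   -- re-assemble the bounded weight matrix
end

-- Example: bound a random 64 x 128 weight matrix with eps = 0.05.
local W = torch.randn(64, 128)
W:copy(boundSingularValues(W, 0.05))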

We present both theoretical and empirical results to justify our proposed methods. In particular, we achieve state-of-the-art results of 3.06% error rate on CIFAR10 and 16.90% on CIFAR100, using off-the-shelf network architectures (Wide ResNets).

Results

Controlled studies on CIFAR10 using 20-layer (left) and 38-layer (right) ConvNets (VGG)


Validation curves on CIFAR10 using two ConvNets of 20 and 38 weight layers respectively. Blue lines are results by SGD with momentum. Red lines are results by SVB at different values of \epsilon (0.01, 0.05, 0.2, 0.5, 1) in Algorithm 1 of the paper. Black lines are results using both SVB (fixing \epsilon = 0.05) and BBN at different values of \tilde{\epsilon} (0.01, 0.05, 0.2, 0.5, 1) in Algorithm 2 of the paper. The left two figures are from the 20-layer ConvNet, and the right two from the 38-layer ConvNet.

Ablation studies on CIFAR10 using a 68-layer ResNet

Training methods | Error rate (%)
SGD with momentum + BN | 6.10 (6.22 +/- 0.14)
SVB + BN | 5.65 (5.79 +/- 0.10)
SVB + BBN | 5.37 (5.49 +/- 0.11)

Ablation studies on CIFAR10, using a pre-activation ResNet with 68 weight layers of 3 x 3 convolutional filters. Results are in the format of best (mean +/- std) over 5 runs. Standard data augmentation (4-pixel zero-padding plus horizontal flipping) is used.

Results on CIFAR10 and CIFAR100 using Wide ResNets

Methods | CIFAR10 error (%) | CIFAR100 error (%) | # layers | # params
Wide ResNet W/O SVB+BBN | 3.78 | 19.92 | 28 | 36.5M
Wide ResNet WITH SVB+BBN | 3.24 | 17.47 | 28 | 36.5M
Wider ResNet W/O SVB+BBN | 3.64 | 19.25 | 28 | 94.2M
Wider ResNet WITH SVB+BBN | 3.06 | 16.90 | 28 | 94.2M

Wide ResNet and Wider ResNet in the table above respectively refer to the architectures "WRN-28-10" and "WRN-28-16" as in Wide Residual Networks. Standard data augmentation (4-pixel zero-padding plus horizontal flipping) is used.

Preliminary results on ImageNet

Training methods | Top-1 error (%) | Top-5 error (%)
Our Inception-ResNet | 21.61 | 5.91
Our Inception-ResNet WITH SVB+BN | 21.20 | 5.57

Results of single-model (Inception-ResNet) and single-crop testing on the ImageNet validation set.

Usage

Installation

The code depends on the Torch library. Please install Torch first.
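
If Torch is not yet installed, the usual route is the official Torch7 distro scripts; the commands below are a sketch of that standard procedure, so please verify them against the instructions at torch.ch and adjust the install path as needed.

git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch
bash install-deps
./install.sh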

Support of datasets

CIFAR10

CIFAR100 (coming soon)

ImageNet (coming soon)

One may refer to the fb.resnet.torch package for how to obtain and pre-process these datasets. The dataset to use is specified in the file optsArgParse.lua.

Support of network architectures

ResNet (pre-activation)

Wide ResNets

Inception-ResNet (coming soon)

DenseNet (coming soon)

ResNeXt (coming soon)

Training

We take a pre-activation version of ResNet as an example to explain how to train a deep network using the SVB and BBN methods.

Run the following at the command line when SVB and BBN are not used (i.e., training is based on standard SGD with momentum):

th main.lua -cudnnSetting deterministic -netType PreActResNet -ensembleID 1 -BN true -nBaseRecur 11 -kWRN 1 -lrDecayMethod exp -lrBase 0.5 -lrEnd 0.001 -batchSize 128 -nEpoch 160 -nLRDecayStage 80 -weightDecay 0.0001 -svBFlag false -bnsBFlag false

Run the following at the command line when SVB is turned on:

th main.lua -cudnnSetting deterministic -netType PreActResNet -ensembleID 1 -BN true -nBaseRecur 11 -kWRN 1 -lrDecayMethod exp -lrBase 0.5 -lrEnd 0.001 -batchSize 128 -nEpoch 160 -nLRDecayStage 80 -weightDecay 0.0001 -svBFlag true -svBFactor 1.5 -svBIter 391 -bnsBFlag false

Run the following at the command line when both SVB and BBN are turned on:

th main.lua -cudnnSetting deterministic -netType PreActResNet -ensembleID 1 -BN true -nBaseRecur 11 -kWRN 1 -lrDecayMethod exp -lrBase 0.5 -lrEnd 0.001 -batchSize 128 -nEpoch 160 -nLRDecayStage 80 -weightDecay 0.0001 -svBFlag true -svBFactor 1.5 -svBIter 391 -bnsBFlag true -bnsBFactor 2 -bnsBType BBN

Set svBFactor such that svBFactor = 1 + \epsilon (the bound in Algorithm 1), and bnsBFactor such that bnsBFactor = 1 + \tilde{\epsilon} (the bound in Algorithm 2); for example, the commands above use -svBFactor 1.5 (\epsilon = 0.5) and -bnsBFactor 2 (\tilde{\epsilon} = 1). Setting kWRN > 1 turns the architecture into a Wide ResNet. Please refer to the file optsArgParse.lua for the other hyperparameters.

One may also set bnsBType as rel to get even better performance.

Use of SVB and BBN in your own code

Implementation of the SVB and BBN methods is in the file cnnTrain.lua, via the functions cnnTrain:fcConvWeightReguViaSVB() and cnnTrain:BNScalingRegu() respectively. One may refer to cnnTrain.lua and main.lua for use of these two functions, and to the sketch below for the general call pattern.
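
A rough sketch of where these calls fit in a training loop is given below; the loop structure, trainOneMiniBatch, and the opt field names are illustrative placeholders inferred from the command-line flags above, not the repository's actual control flow (see main.lua for that).

-- Hypothetical skeleton: apply SVB (and optionally BBN) every svBIter mini-batches.
for iter = 1, nIterations do
  trainOneMiniBatch()                    -- standard SGD-with-momentum update (placeholder)
  if opt.svBFlag and iter % opt.svBIter == 0 then
    cnnTrain:fcConvWeightReguViaSVB()    -- SVB: bound singular values of the weight matrices
    if opt.bnsBFlag then
      cnnTrain:BNScalingRegu()           -- BBN: bound the Batch Normalization scaling
    end
  end
end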

Contact

kuijia At scut.edu.cn
