SqueezeNet: AlexNet-level accuracy with 50x fewer parameters
forresti/SqueezeNet
The Caffe-compatible files that you are probably looking for:

    SqueezeNet_v1.0/train_val.prototxt          #model architecture
    SqueezeNet_v1.0/solver.prototxt             #additional training details (learning rate schedule, etc.)
    SqueezeNet_v1.0/squeezenet_v1.0.caffemodel  #pretrained model parameters

If you find SqueezeNet useful in your research, please consider citing the SqueezeNet paper:

    @article{SqueezeNet,
        Author = {Forrest N. Iandola and Song Han and Matthew W. Moskewicz and Khalid Ashraf and William J. Dally and Kurt Keutzer},
        Title = {SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and $<$0.5MB model size},
        Journal = {arXiv:1602.07360},
        Year = {2016}
    }

Helpful hints:
- Getting the SqueezeNet model: `git clone <this repo>`. In this repository, we include Caffe-compatible files for the model architecture, the solver configuration, and the pretrained model (4.8MB uncompressed).
- Batch size. We have experimented with batch sizes ranging from 32 to 1024. In this repo, our default batch size is 512. If implemented naively on a single GPU, a batch size this large may result in running out of memory. An effective workaround is to use hierarchical batching (sometimes called "delayed batching"). Caffe supports hierarchical batching by processing `train_val.prototxt>batch_size` training samples concurrently in memory. After `solver.prototxt>iter_size` iterations, the gradients are summed and the model is updated. Mathematically, the batch size is `batch_size * iter_size`. In the included prototxt files, we have set `(batch_size=32, iter_size=16)`, but any combination of `batch_size` and `iter_size` that multiplies to 512 will produce equivalent results. In fact, with the same random number generator seed, the model will be fully reproducible if trained multiple times. Finally, note that in Caffe `iter_size` is applied while training on the training set but not while testing on the test set.
- Implementing Fire modules. In the paper, we describe the `expand` portion of the Fire layer as a collection of 1x1 and 3x3 filters. Caffe does not natively support a convolution layer that has multiple filter sizes. To work around this, we implement `expand1x1` and `expand3x3` layers and concatenate the results together in the channel dimension.
- The SqueezeNet team has released a few variants of SqueezeNet. Each of these includes pretrained models, and the non-compressed versions include training protocols, too.
  - SqueezeNet v1.0 (in this repo), the base model described in our SqueezeNet paper.
  - Compressed SqueezeNet v1.0, as described in the SqueezeNet paper.
  - SqueezeNet v1.0 with Residual Connections, which delivers higher accuracy without increasing the model size.
  - SqueezeNet v1.0 with Dense→Sparse→Dense (DSD) Training, which delivers higher accuracy without increasing the model size.
  - SqueezeNet v1.1 (in this repo), which requires 2.4x less computation than SqueezeNet v1.0 without diminishing accuracy.
- Community adoption of SqueezeNet:
  - SqueezeNet in the MXNet framework, by Guo Haria
  - SqueezeNet in the Chainer framework, by Eddie Bell
  - SqueezeNet in the Keras framework, by dt42.io
  - SqueezeNet in the TensorFlow framework, by Domenick Poster
  - SqueezeNet in the PyTorch framework, by Marat Dukhan
  - SqueezeNet in the CoreML framework
  - Neural Art using SqueezeNet, by Pavel Gonchar
  - SqueezeNet compression in Ristretto, by Philipp Gysel
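
To make the batch size hint above more concrete, here is a minimal sketch in plain Python (not Caffe code) of why different `(batch_size, iter_size)` pairs with the same product give the same update. It uses a hypothetical one-parameter least-squares model, and the helper names `mean_gradient` and `accumulated_gradient` are made up for this illustration. The sketch averages the per-pass gradients, which differs from a plain sum only by the constant factor `iter_size`; how Caffe normalizes internally is not shown here.

```python
import math
import random


def mean_gradient(w, xs, ys):
    """Mean gradient of 0.5 * (w * x - y)**2 with respect to w over one batch."""
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_gradient(w, xs, ys, batch_size, iter_size):
    """Run iter_size passes of batch_size samples each, then average the gradients."""
    assert len(xs) == batch_size * iter_size
    total = 0.0
    for i in range(iter_size):
        lo, hi = i * batch_size, (i + 1) * batch_size
        total += mean_gradient(w, xs[lo:hi], ys[lo:hi])
    return total / iter_size


# Toy data: 512 samples, matching the effective batch size used in this repo.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(512)]
ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]
w = 0.5

full_batch = mean_gradient(w, xs, ys)
for batch_size, iter_size in [(32, 16), (64, 8), (512, 1)]:
    acc = accumulated_gradient(w, xs, ys, batch_size, iter_size)
    # Equal up to floating-point rounding for every split of the 512 samples.
    print(batch_size, iter_size, math.isclose(acc, full_batch, rel_tol=1e-9))
```

Only `batch_size` samples need to be held in memory at a time, which is the point of the workaround. The equivalence shown here relies on the loss being a plain average over samples.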
If you like SqueezeNet, you might also like SqueezeNext! (SqueezeNext paper, SqueezeNext code)
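
The Fire module workaround described in the hints above (separate `expand1x1` and `expand3x3` layers, concatenated in the channel dimension) can be sketched in plain Python as follows. This is an illustration, not the Caffe implementation: the function names, the tiny channel counts, the omission of biases, and the choice of padding 1 on the 3x3 branch (so both branches keep the same spatial size) are all assumptions of this sketch.

```python
import random


def conv2d(x, weights, pad):
    """Naive stride-1 convolution with zero padding.

    x: [C_in][H][W], weights: [C_out][C_in][K][K]. Biases are omitted.
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(weights[0][0])
    out_h, out_w = h + 2 * pad - k + 1, w + 2 * pad - k + 1
    out = []
    for filt in weights:
        plane = [[0.0] * out_w for _ in range(out_h)]
        for i in range(out_h):
            for j in range(out_w):
                acc = 0.0
                for c in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            yi, xj = i + di - pad, j + dj - pad
                            if 0 <= yi < h and 0 <= xj < w:
                                acc += filt[c][di][dj] * x[c][yi][xj]
                plane[i][j] = acc
        out.append(plane)
    return out


def relu(x):
    return [[[max(0.0, v) for v in row] for row in plane] for plane in x]


def fire(x, w_squeeze, w_expand1x1, w_expand3x3):
    """Squeeze with 1x1 filters, expand with two branches, concatenate channels."""
    s = relu(conv2d(x, w_squeeze, pad=0))
    e1 = relu(conv2d(s, w_expand1x1, pad=0))
    e3 = relu(conv2d(s, w_expand3x3, pad=1))  # pad=1 keeps H and W unchanged
    return e1 + e3  # list concatenation along the channel axis


def random_weights(c_out, c_in, k, rng):
    return [[[[rng.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(k)]
             for _ in range(c_in)] for _ in range(c_out)]


rng = random.Random(0)
x = [[[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(5)] for _ in range(3)]
out = fire(
    x,
    random_weights(2, 3, 1, rng),  # squeeze: 3 -> 2 channels
    random_weights(4, 2, 1, rng),  # expand1x1: 2 -> 4 channels
    random_weights(4, 2, 3, rng),  # expand3x3: 2 -> 4 channels
)
print(len(out), len(out[0]), len(out[0][0]))  # channels, height, width
```

With these illustrative sizes the output has 4 + 4 = 8 channels and the same 5x5 spatial size as the input, which is what allows the two expand branches to be concatenated.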