A coding-free framework built on PyTorch for reproducible deep learning studies. Part of the PyTorch Ecosystem. 🏆 26 knowledge distillation methods presented at CVPR, ICLR, ECCV, NeurIPS, ICCV, etc. are implemented so far. 🎁 Trained models, training logs, and configurations are available to ensure reproducibility and serve as benchmarks.

torchdistill logo

torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation

PyPI version | Build Status | GitHub Discussions | DOI: 10.1007/978-3-030-76423-4_3 | DOI: 10.18653/v1/2023.nlposs-1.18

torchdistill (formerly kdkit) offers various state-of-the-art knowledge distillation methods and enables you to design (new) experiments simply by editing a declarative yaml config file instead of Python code. Even when you need to extract intermediate representations in teacher/student models, you will NOT need to reimplement the models, which often changes the forward interface; instead, you specify the module path(s) in the yaml file. Refer to these papers for more details.

In addition to knowledge distillation, this framework helps you design and perform general deep learning experiments (WITHOUT coding) for reproducible deep learning studies, i.e., it enables you to train models without teachers simply by excluding teacher entries from a declarative yaml config file. You can find such examples below and in configs/sample/.

In December 2023, torchdistill officially joined the PyTorch Ecosystem.

When you refer to torchdistill in your paper, please cite these papers instead of this GitHub repository.
If you use torchdistill as part of your work, your citation is appreciated and motivates me to maintain and upgrade this framework!

Documentation

You can find the API documentation and research projects that leverage torchdistill at https://yoshitomo-matsubara.net/torchdistill/

Forward hook manager

Using ForwardHookManager, you can extract intermediate representations in a model without modifying the interface of its forward function.
This example notebook (Open in Colab / Open in Studio Lab) will give you a better idea of its usage, such as knowledge distillation and analysis of intermediate representations.

E.g., extract intermediate representations (feature maps) of ResNet-18 for a random input batch:

```python
import torch
from torchvision import models
from torchdistill.core.forward_hook import ForwardHookManager

# Define a model and choose torch device
model = models.resnet18(pretrained=False)
device = torch.device('cpu')

# Register forward hooks for modules of your interest
forward_hook_manager = ForwardHookManager(device)
forward_hook_manager.add_hook(model, 'conv1', requires_input=True, requires_output=False)
forward_hook_manager.add_hook(model, 'layer1.0.bn2', requires_input=True, requires_output=True)
forward_hook_manager.add_hook(model, 'fc', requires_input=False, requires_output=True)

# Define a random input batch and run the model
x = torch.rand(32, 3, 224, 224)
y = model(x)

# Extract input and/or output of the modules
io_dict = forward_hook_manager.pop_io_dict()
conv1_input = io_dict['conv1']['input']
layer1_0_bn2_input = io_dict['layer1.0.bn2']['input']
layer1_0_bn2_output = io_dict['layer1.0.bn2']['output']
fc_output = io_dict['fc']['output']
```
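As a rough sketch of the knowledge distillation use case mentioned above, the snippet below hooks the same module path in a teacher and a student ResNet-18 and compares the extracted feature maps with a plain MSE loss. This is NOT torchdistill's built-in distillation pipeline (which is normally driven by a yaml config); the module path, the two ResNet-18 models, and the loss choice are illustrative assumptions.

```python
import torch
from torch import nn
from torchvision import models
from torchdistill.core.forward_hook import ForwardHookManager

device = torch.device('cpu')
teacher = models.resnet18(pretrained=False).eval()
student = models.resnet18(pretrained=False)

# Hook the same module path in both models so the feature maps have matching shapes
teacher_hook_manager = ForwardHookManager(device)
teacher_hook_manager.add_hook(teacher, 'layer1.0.bn2', requires_input=False, requires_output=True)
student_hook_manager = ForwardHookManager(device)
student_hook_manager.add_hook(student, 'layer1.0.bn2', requires_input=False, requires_output=True)

# Run a random batch through both models
x = torch.rand(8, 3, 224, 224)
with torch.no_grad():
    teacher(x)
student(x)

# Pull out the hooked feature maps and compare them with a simple hint-style MSE term
teacher_feature = teacher_hook_manager.pop_io_dict()['layer1.0.bn2']['output']
student_feature = student_hook_manager.pop_io_dict()['layer1.0.bn2']['output']
feature_loss = nn.functional.mse_loss(student_feature, teacher_feature)
print(feature_loss.item())
```

In practice, you would combine such a term with a task loss inside a training loop; with torchdistill itself, a yaml-configured training box takes care of this for you.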

1 experiment → 1 declarative PyYAML config file

In torchdistill, many components and PyTorch modules are abstracted, e.g., models, datasets, optimizers, losses, and more! You can define them in a declarative PyYAML config file that can be seen as a summary of your experiment, and in many cases, you will NOT need to write Python code at all. Take a look at some configurations available in configs/. You'll see what modules are abstracted and how they are defined in a declarative PyYAML config file to design an experiment.

E.g., instantiate CIFAR-10 datasets with a declarative PyYAML config file

```python
from torchdistill.common import yaml_util

config = yaml_util.load_yaml_file('./test.yaml')
train_dataset = config['datasets']['cifar10/train']
test_dataset = config['datasets']['cifar10/test']
```

test.yaml

```yaml
datasets:
  cifar10/train: !import_call
    key: 'torchvision.datasets.CIFAR10'
    init:
      kwargs:
        root: &root_dir '~/datasets/cifar10'
        train: True
        download: True
        transform: !import_call
          key: 'torchvision.transforms.Compose'
          init:
            kwargs:
              transforms:
                - !import_call
                  key: 'torchvision.transforms.RandomCrop'
                  init:
                    kwargs:
                      size: 32
                      padding: 4
                - !import_call
                  key: 'torchvision.transforms.RandomHorizontalFlip'
                  init:
                    kwargs:
                      p: 0.5
                - !import_call
                  key: 'torchvision.transforms.ToTensor'
                  init:
                - !import_call
                  key: 'torchvision.transforms.Normalize'
                  init:
                    kwargs: &normalize_kwargs
                      mean: [0.49139968, 0.48215841, 0.44653091]
                      std: [0.24703223, 0.24348513, 0.26158784]
  cifar10/test: !import_call
    key: 'torchvision.datasets.CIFAR10'
    init:
      kwargs:
        root: *root_dir
        train: False
        download: True
        transform: !import_call
          key: 'torchvision.transforms.Compose'
          init:
            kwargs:
              transforms:
                - !import_call
                  key: 'torchvision.transforms.ToTensor'
                  init:
                - !import_call
                  key: 'torchvision.transforms.Normalize'
                  init:
                    kwargs: *normalize_kwargs
```
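The instantiated objects are ordinary torchvision datasets, so they plug directly into standard PyTorch data loaders. The sketch below shows this; the batch sizes and worker count are arbitrary choices for illustration.

```python
from torch.utils.data import DataLoader
from torchdistill.common import yaml_util

config = yaml_util.load_yaml_file('./test.yaml')
train_dataset = config['datasets']['cifar10/train']
test_dataset = config['datasets']['cifar10/test']

# The yaml-instantiated objects behave like any other torchvision dataset
train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True, num_workers=2)
test_loader = DataLoader(test_dataset, batch_size=256, shuffle=False, num_workers=2)

images, labels = next(iter(train_loader))
print(images.shape, labels.shape)  # e.g., torch.Size([128, 3, 32, 32]) torch.Size([128])
```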

If you want to use your own modules (models, loss functions, datasets, etc.) with this framework, you can do so without editing code in the local package torchdistill/.
See the official documentation and Discussions for more details.
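For example, a custom model can be made visible to the yaml loader by registering it, as in the minimal sketch below. The registry import path and decorator name (torchdistill.models.registry.register_model) are assumptions here; check the official documentation for the exact names in your version.

```python
from torch import nn
# NOTE: assumed import path; verify against the official documentation
from torchdistill.models.registry import register_model


@register_model
class TinyCNN(nn.Module):
    """A toy custom model that becomes referable by name from a yaml config once registered."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1)
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)
```

Once registered (e.g., by importing the module that defines it in your script), the class can be referenced by its name in the models section of your config.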

Benchmarks

Top-1 validation accuracy for ILSVRC 2012 (ImageNet)

Examples

Executable code can be found in examples/.

For CIFAR-10 and CIFAR-100, some models are reimplemented and available as pretrained models in torchdistill. More details can be found here.

Some Transformer models fine-tuned by torchdistill for GLUE tasks are available at Hugging Face Model Hub. Sample GLUE benchmark results and details can be found here.

Google Colab Examples

The following examples are available in demo/. Note that these examples are for Google Colab users and are compatible with Amazon SageMaker Studio Lab. Usually, examples/ would be a better reference if you have your own GPU(s).

CIFAR-10 and CIFAR-100

  • Training without teacher models (Open in Colab / Open in Studio Lab)
  • Knowledge distillation (Open in Colab / Open in Studio Lab)

GLUE

  • Fine-tuning without teacher models (Open in Colab / Open in Studio Lab)
  • Knowledge distillation (Open in Colab / Open in Studio Lab)

These examples write out test prediction files for you to see the test performance at the GLUE leaderboard system.

PyTorch Hub

If you find models on PyTorch Hub or GitHub repositories supporting PyTorch Hub, you can import them as teacher/student models simply by editing a declarative yaml config file.

E.g., if you use a pretrained ResNeSt-50 available in huggingface/pytorch-image-models (aka timm) as a teacher model for the ImageNet dataset, you can import the model via PyTorch Hub with the following entry in your declarative yaml config file.

```yaml
models:
  teacher_model:
    key: 'resnest50d'
    repo_or_dir: 'huggingface/pytorch-image-models'
    kwargs:
      num_classes: 1000
      pretrained: True
```
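For reference, such an entry roughly corresponds to the following torch.hub call. Whether torchdistill forwards the kwargs exactly like this is an assumption, but the snippet is a quick way to check that the repository and model key resolve.

```python
import torch

# Roughly what the yaml entry above resolves to: load ResNeSt-50 from timm via PyTorch Hub
teacher_model = torch.hub.load('huggingface/pytorch-image-models', 'resnest50d',
                               num_classes=1000, pretrained=True)
teacher_model.eval()
```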

How to setup

  • Python >= 3.9
  • pipenv (optional)

Install by pip/pipenv

```shell
pip3 install torchdistill
# or use pipenv
pipenv install torchdistill
```

Install from this repository (not recommended)

```shell
git clone https://github.com/yoshitomo-matsubara/torchdistill.git
cd torchdistill/
pip3 install -e .
# or use pipenv
pipenv install "-e ."
```

Issues / Questions / Requests / Pull Requests

Feel free to create an issue if you find a bug.
If you have either a question or feature request, start a new discussion here. Please search through Issues and Discussions and make sure your issue/question/request has not been addressed yet.

Pull requests are welcome. Please start with an issue and discuss solutions with me rather than starting with a pull request.

Citation

If you use torchdistill in your research, please cite the following papers:
[Paper] [Preprint]

```bibtex
@inproceedings{matsubara2021torchdistill,
  title={{torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation}},
  author={Matsubara, Yoshitomo},
  booktitle={International Workshop on Reproducible Research in Pattern Recognition},
  pages={24--44},
  year={2021},
  organization={Springer}
}
```

[Paper] [OpenReview] [Preprint]

```bibtex
@inproceedings{matsubara2023torchdistill,
  title={{torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP}},
  author={Matsubara, Yoshitomo},
  booktitle={Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)},
  publisher={Empirical Methods in Natural Language Processing},
  pages={153--164},
  year={2023}
}
```

Acknowledgments

This project has been supported by Travis CI's OSS credits and JetBrains' Free License Programs (Open Source) since November 2021 and June 2022, respectively.
PyCharm logo

References
