davidmrau/mixture-of-experts

PyTorch re-implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. (https://arxiv.org/abs/1701.06538)
This repository contains a PyTorch re-implementation of the sparsely-gated MoE layer described in the paper Outrageously Large Neural Networks (https://arxiv.org/abs/1701.06538).
```python
from moe import MoE
import torch

# instantiate the MoE layer
model = MoE(input_size=1000, output_size=20, num_experts=10,
            hidden_size=66, k=4, noisy_gating=True)

X = torch.rand(32, 1000)

# train
model.train()
# forward
y_hat, aux_loss = model(X)

# evaluation
model.eval()
y_hat, aux_loss = model(X)
```
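During training, the returned aux_loss (the gating network's load-balancing term) is typically added to the task loss before backpropagation. Below is a minimal sketch of such a training step, assuming a classification objective and the constructor arguments shown above:

```python
import torch
from torch import nn, optim
from moe import MoE

# same layer configuration as in the snippet above
model = MoE(input_size=1000, output_size=20, num_experts=10,
            hidden_size=66, k=4, noisy_gating=True)
optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

X = torch.rand(32, 1000)              # dummy inputs
target = torch.randint(0, 20, (32,))  # dummy class labels (assumed classification setup)

model.train()
y_hat, aux_loss = model(X)
# add the auxiliary load-balancing loss to the task loss
loss = criterion(y_hat, target) + aux_loss
optimizer.zero_grad()
loss.backward()
optimizer.step()
```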
To install the requirements run:
pip install -r requirements.txt
The file example.py contains a minimal working example illustrating how to train and evaluate the MoE layer with dummy inputs and targets. To run the example:
python example.py
The file cifar10_example.py contains a minimal working example on the CIFAR-10 dataset. It achieves an accuracy of 39% with arbitrary hyper-parameters and without full convergence. To run the example:
python cifar10_example.py
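The sketch below illustrates how the MoE layer can be wired up for CIFAR-10, with images flattened to vectors of length 3*32*32 and output_size=10; the hyper-parameters here are placeholders and may differ from those used in cifar10_example.py.

```python
import torch
from torch import nn, optim
from moe import MoE

# placeholder hyper-parameters; cifar10_example.py may use different values
model = MoE(input_size=3 * 32 * 32, output_size=10,
            num_experts=10, hidden_size=256, k=4, noisy_gating=True)
optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

images = torch.rand(64, 3, 32, 32)    # stand-in for a CIFAR-10 batch
labels = torch.randint(0, 10, (64,))  # stand-in labels

model.train()
y_hat, aux_loss = model(images.view(images.size(0), -1))  # flatten 3x32x32 images
loss = criterion(y_hat, labels) + aux_loss
optimizer.zero_grad()
loss.backward()
optimizer.step()
```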
In FastMoE: A Fast Mixture-of-Expert Training System, this implementation was used as a reference PyTorch implementation for single-GPU training.
The code is based on the TensorFlow implementation that can be found here.
```
@misc{rau2019moe,
    title={Sparsely-gated Mixture-of-Experts PyTorch implementation},
    author={Rau, David},
    journal={https://github.com/davidmrau/mixture-of-experts},
    year={2019}
}
```