davidmrau/mixture-of-experts

PyTorch re-implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. (https://arxiv.org/abs/1701.06538)

Image source: https://techburst.io/outrageously-large-neural-network-gated-mixture-of-experts-billions-of-parameter-same-d3e901f2fe05

This repository contains a PyTorch re-implementation of the sparsely-gated MoE layer described in the paper "Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer" (Shazeer et al., 2017).

```python
from moe import MoE
import torch

# instantiate the MoE layer
model = MoE(input_size=1000, output_size=20, num_experts=10, hidden_size=66, k=4, noisy_gating=True)

X = torch.rand(32, 1000)

# train
model.train()
# forward
y_hat, aux_loss = model(X)

# evaluation
model.eval()
y_hat, aux_loss = model(X)
```
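The forward pass returns both the layer output and an auxiliary load-balancing loss. A minimal sketch of combining the two during training, assuming the output can be fed to a standard classification loss (the loss actually used by this repository's examples may differ):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()      # assumed task loss, for illustration only
y = torch.randint(0, 20, (32,))        # dummy integer targets matching output_size=20

y_hat, aux_loss = model(X)
loss = criterion(y_hat, y) + aux_loss  # aux_loss encourages balanced expert utilization
loss.backward()
```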

Requirements

To install the requirements run:

pip install -r requirements.py

Example

The file example.py contains a minimal working example illustrating how to train and evaluate the MoE layer with dummy inputs and targets. To run the example:

python example.py
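For orientation, here is a minimal sketch of such a training loop with dummy data. It is an illustration built on the constructor shown above, not the contents of example.py:

```python
import torch
import torch.nn as nn
from moe import MoE

# dimensions as in the usage snippet above; chosen arbitrarily for illustration
model = MoE(input_size=1000, output_size=20, num_experts=10,
            hidden_size=66, k=4, noisy_gating=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()  # assumed task loss

model.train()
for step in range(100):
    X = torch.rand(32, 1000)               # dummy inputs
    y = torch.randint(0, 20, (32,))        # dummy targets
    y_hat, aux_loss = model(X)
    loss = criterion(y_hat, y) + aux_loss  # task loss plus load-balancing loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```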

CIFAR-10 example

The file cifar10_example.py contains a minimal working example for the CIFAR-10 dataset. It achieves an accuracy of 39% with arbitrary hyper-parameters and without full convergence. To run the example:

python cifar10_example.py
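As a rough sketch of how CIFAR-10 images could be routed through the layer (each 3×32×32 image flattened to 3072 features), under arbitrarily chosen hyper-parameters that need not match cifar10_example.py:

```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from moe import MoE

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True,
                                        transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

# 3 x 32 x 32 images flattened to 3072 inputs, 10 output classes
model = MoE(input_size=3072, output_size=10, num_experts=10,
            hidden_size=256, k=4, noisy_gating=True)  # hyper-parameters are placeholders
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()  # assumed task loss

model.train()
for images, labels in loader:
    inputs = images.view(images.size(0), -1)   # flatten each image to a 3072-dim vector
    y_hat, aux_loss = model(inputs)
    loss = criterion(y_hat, labels) + aux_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```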

Used by

FastMoE: A Fast Mixture-of-Expert Training System. This implementation was used as a reference PyTorch implementation for single-GPU training.

Acknowledgements

The code is based on the TensorFlow implementation, which can be found here.

Citing

```
@misc{rau2019moe,
    title={Sparsely-gated Mixture-of-Experts PyTorch implementation},
    author={Rau, David},
    journal={https://github.com/davidmrau/mixture-of-experts},
    year={2019}
}
```
