Repo for external large-scale work
facebookresearch/metaseq
A codebase for working with Open Pre-trained Transformers, originally forked from fairseq.
The OPT 125M–66B models are now available in Hugging Face Transformers. You can access them under the facebook organization on the Hugging Face Hub.
The OPT 125M–175B models are now supported in the Alpa project, which enables serving OPT-175B with more flexible parallelisms on older generations of GPUs, such as 40GB A100, V100, T4, M60, etc.
The OPT models are now supported in Colossal-AI, which helps users deploy OPT training and inference quickly and efficiently, reducing large AI model budgets and the labor cost of learning and deployment.
The OPT 125M–66B models can be executed with CTranslate2, which is a fast inference engine for Transformer models. The project integrates the SmoothQuant technique to allow 8-bit quantization of OPT models. See the usage example to get started.
The OPT models can be served with FasterTransformer, a highly optimized inference framework written and maintained by NVIDIA. We provide instructions to convert OPT checkpoints into FasterTransformer format and a usage example with some benchmark results.
The OPT models can be finetuned using DeepSpeed. See the DeepSpeed-Chat example to get started.
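As a rough illustration of the Hugging Face Transformers integration mentioned above, the sketch below loads an OPT checkpoint from the `facebook` organization on the Hub and generates a short continuation. The helper names `opt_model_id` and `generate` are hypothetical and not part of metaseq; the sketch assumes `transformers` and `torch` are installed and that the checkpoint can be downloaded.

```python
"""Minimal sketch (not part of metaseq): run an OPT checkpoint via Hugging Face Transformers."""

# OPT sizes published under the `facebook` organization on the Hub (125M to 66B).
OPT_SIZES = ("125m", "350m", "1.3b", "2.7b", "6.7b", "13b", "30b", "66b")


def opt_model_id(size: str) -> str:
    """Hypothetical helper: map a size such as '125m' to 'facebook/opt-125m'."""
    key = size.lower()
    if key not in OPT_SIZES:
        raise ValueError(f"unknown OPT size {size!r}; expected one of {OPT_SIZES}")
    return f"facebook/opt-{key}"


def generate(prompt: str, size: str = "125m", max_new_tokens: int = 20) -> str:
    """Hypothetical helper: download the checkpoint and continue `prompt`."""
    # Imported lazily so opt_model_id() works even without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = opt_model_id(size)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Hello, I am")` downloads `facebook/opt-125m` on first use and returns the prompt plus a continuation. Larger sizes follow the same pattern but need considerably more memory.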
Follow setup instructions here to get started.
If you have any questions, bug reports, or feature requests regarding either the codebase or the models released in the projects section, please don't hesitate to post on our GitHub Issues page.
Please remember to follow our Code of Conduct.
We welcome PRs from the community!
You can find information about contributing to metaseq in our Contributing document.
Metaseq is currently maintained by the CODEOWNERS: Susan Zhang, Naman Goyal, Punit Singh Koura, Moya Chen, Kurt Shuster, David Esiobu, Igor Molybog, Peter Albert, Andrew Poulton, Nikolay Bashlykov, Binh Tang, Uriel Singer, Yuchen Zhang, Armen Aghajanyan, Lili Yu, and Adam Polyak.
The majority of metaseq is licensed under the MIT license; however, portions of the project are available under separate license terms:
- Megatron-LM is licensed under the Megatron-LM license