Pinned
- pytorch/pytorch (Public): Tensors and Dynamic neural networks in Python with strong GPU acceleration
- NVIDIA/Megatron-LM (Public): Ongoing research training transformer models at scale
- Megatron-MoE-ModelZoo (Public): Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.



