Releases · NVIDIA-NeMo/Megatron-Bridge
NVIDIA Megatron-Bridge 0.1.0rc4 (Pre-release)
Commit 6725f70, created on GitHub.com and signed with GitHub's verified signature.
- Fix docs build
- Update performance scripts
NVIDIA Megatron-Bridge 0.1.0rc3 (Pre-release)
Commit bf71eba, created on GitHub.com and signed with GitHub's verified signature.
- Model Collection Support
- Llama
- Qwen 2, Qwen 3, Qwen 3 MoE
- DeepSeek
- Mamba
- Migration guide from NeMo 2 to Megatron-Bridge
- Contribution guide for adding a new model
- Checkpoint conversion from Hugging Face to Megatron (see the sketch below)
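To make the conversion bullet concrete, here is a minimal sketch of loading a Hugging Face checkpoint through the bridge. `AutoBridge.from_hf_pretrained` follows the project's documented entry point; the `to_megatron_provider()` call is an assumption for illustration, not a pinned API.

```python
# Minimal sketch: Hugging Face -> Megatron checkpoint conversion.
# `AutoBridge.from_hf_pretrained` follows the project's docs; the
# `to_megatron_provider()` call below is an assumption for illustration.
from megatron.bridge import AutoBridge

# Load the HF config/weights and build a bridge around them.
bridge = AutoBridge.from_hf_pretrained("meta-llama/Llama-3.2-1B")

# Turn the bridged checkpoint into a Megatron-side model provider,
# from which a (tensor/pipeline-parallel) Megatron model can be built.
provider = bridge.to_megatron_provider()
```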
- Performance
- MoE LLM
- Switched the model to dropless MoE with balanced gating (see the routing sketch after this section)
- Fusion of operators in the router function
- Global permutation fusion with A2A dispatcher
- EP A2A communication overlap with computation in both 1F1B pipelining and non-pipelined training
- Precision-aware optimizer update to support BF16 states
- Megatron FSDP
- Migration from MCore FSDP to Megatron FSDP
- Fused the copy of weight gradients into the reduce-scatter communication buffer with the WGRAD GEMM
- Removed redundant optimizer operations
- Use ZeRO-1 (optimizer and master-parameter sharding) in the replica domain of hybrid FSDP to further lower memory usage
- IB SHARP support for the IB AllReduce of hybrid FSDP, enabled via a patch with NCCL 2.28
- MXFP8
- Improved activation-gradient all-gather overlap performance via userbuffers
- Parameter all-gather overlapped with computation, with the communication buffer shared with reduce-scatter
- Fusion of MXFP8 scaling factor swizzling kernels
- Use PDL (Programmatic Dependent Launch) for quantization kernels to lower CPU overhead
- Others
- Full-iteration CUDA graph for dense models without pipelining (see the capture sketch after this section)
- Fusion of activation and cast kernels (currently tensor-wise scaling only)
- Store SwiGLU input in FP8 to save activation memory
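As a concept check for the dropless/balanced-gating item above, the sketch below shows plain-PyTorch top-k routing with the standard auxiliary load-balancing loss. It is illustrative only and unrelated to Megatron Core's fused router kernels; all names here are ours.

```python
import torch
import torch.nn.functional as F

def topk_router(logits: torch.Tensor, k: int, aux_coeff: float = 1e-2):
    """Illustrative top-k router with an auxiliary load-balancing loss.

    logits: [num_tokens, num_experts] raw router scores.
    Returns (topk_probs, topk_idx, aux_loss). Dropless: every token is
    dispatched to all k selected experts; balance is encouraged by the
    loss term instead of by dropping overflow tokens.
    """
    num_experts = logits.size(-1)
    gates = F.softmax(logits, dim=-1)                    # [T, E]
    topk_probs, topk_idx = torch.topk(gates, k, dim=-1)  # [T, k]

    # Fraction of dispatched (token, expert) pairs landing on each expert.
    dispatch = F.one_hot(topk_idx, num_experts).sum(dim=1).float()  # [T, E]
    load = dispatch.mean(dim=0)      # [E] mean tokens routed per expert
    importance = gates.mean(dim=0)   # [E] mean router probability per expert

    # Switch-Transformer-style balancing term: minimal when both are uniform.
    aux_loss = aux_coeff * num_experts * torch.sum(load * importance)
    return topk_probs, topk_idx, aux_loss
```

With dropless dispatch every token keeps all k selected experts, so balance has to come from the loss term rather than from capacity-based token dropping.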
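The full-iteration CUDA graph item can be pictured with PyTorch's stock capture API. This toy example, adapted from the whole-network capture pattern in PyTorch's CUDA-graphs documentation rather than from Megatron-Bridge code, captures forward, backward, and the optimizer step of a tiny dense model and replays the whole iteration:

```python
import torch

# Toy full-iteration capture for a dense, non-pipelined model.
# Shapes must stay static; tensors used inside the graph are
# pre-allocated placeholders.
model = torch.nn.Linear(1024, 1024, device="cuda")
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
static_x = torch.randn(8, 1024, device="cuda")

# Warm up on a side stream so kernel/workspace setup happens pre-capture.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    for _ in range(3):
        opt.zero_grad(set_to_none=True)
        model(static_x).sum().backward()
        opt.step()
torch.cuda.current_stream().wait_stream(s)

# Capture one whole iteration: forward + backward + optimizer step.
opt.zero_grad(set_to_none=True)
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    static_loss = model(static_x).sum()
    static_loss.backward()
    opt.step()

# Replay: refill the static input in place, then re-run the captured
# iteration; backward refills the static .grad tensors each replay.
for _ in range(10):
    static_x.copy_(torch.randn(8, 1024, device="cuda"))
    g.replay()
```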
NVIDIA Megatron-Bridge 0.1.0a0 (Pre-release)
- Llama and Qwen
- Pretrain/SFT
- PEFT
- Recipe structure with examples for plain Python & NeMo Run usage (see the sketch below)
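For the recipe bullet, the snippet below gestures at the "plain Python" flow. The module paths, factory, and config fields are assumptions made for illustration; the repository's recipe examples are the authoritative reference.

```python
# Hypothetical sketch of plain-Python recipe usage. Every imported name
# and config field below is an assumption -- consult the repo's recipe
# examples for the real API.
from megatron.bridge.recipes.llama import llama3_8b_pretrain_config  # assumed name
from megatron.bridge.training.pretrain import pretrain               # assumed name
from megatron.bridge.training.gpt_step import forward_step           # assumed name

cfg = llama3_8b_pretrain_config()  # build the default pretraining recipe config
cfg.train.train_iters = 10         # assumed field: shrink the run for a smoke test
pretrain(cfg, forward_step)        # launch training with the recipe config
```

The same config object can instead be wrapped in a NeMo Run experiment for cluster launching, which is the second usage path the recipe examples cover.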