NVIDIA-NeMo/Megatron-Bridge

HuggingFace conversion and training library for Megatron-based models

docs.nvidia.com/nemo/megatron-bridge/latest/

License

Apache-2.0 license

Megatron Bridge

Documentation | Supported Models | Examples | Contributing

Overview

NeMo Megatron Bridge is a PyTorch-native library within the NeMo Framework that provides pretraining, SFT, and LoRA for popular LLM and VLM models. It serves as a powerful bridge, conversion, and verification layer between 🤗 Hugging Face and Megatron Core. It provides bidirectional checkpoint conversion between these formats, enabling other projects to leverage Megatron Core's parallelism capabilities or export models for various inference engines. The bridge includes built-in verification mechanisms to ensure conversion accuracy and checkpoint integrity across different model formats.

On top of the bridge, NeMo Megatron Bridge provides a performant and scalable PyTorch-native training loop that leverages Megatron Core to deliver state-of-the-art training throughput. It supports pretraining and fine-tuning with features like tensor and pipeline parallelism, and mixed precision (FP8, BF16, FP4, etc.). Users can either use existing 🤗 Hugging Face models or define custom PyTorch model definitions for flexible end-to-end workflows.

NeMo Megatron Bridge is a refactor of the previous NeMo training stack that adopts a PyTorch-native training loop to provide greater flexibility and customizability for developers.

🔧 Installation

🐳 NeMo Framework container

The best experience, highest performance, and full feature support are provided by the NeMo Framework container. Fetch the most recent $TAG and run the following to start a container:

docker run --rm -it -w /workdir -v $(pwd):/workdir \
  --entrypoint bash \
  --gpus all \
  nvcr.io/nvidia/nemo:${TAG}

For development installation and additional details, please refer to our Contribution guide.

⚡ Quickstart

To get started, install Megatron Bridge or download a NeMo Framework container as described above.

huggingface-cli login --token <your token>

Conversion-only quickstart (✅ Core):

from megatron.bridge import AutoBridge

# 1) Create a bridge from a Hugging Face model (hub or local path)
bridge = AutoBridge.from_hf_pretrained("meta-llama/Llama-3.2-1B", trust_remote_code=True)

# 2) Get a Megatron provider and configure parallelism before instantiation
provider = bridge.to_megatron_provider()
provider.tensor_model_parallel_size = 1
provider.pipeline_model_parallel_size = 1
provider.finalize()

# 3) Materialize Megatron Core model(s)
model = provider.provide_distributed_model(wrap_with_ddp=False)

# 4a) Export Megatron → Hugging Face (full HF folder with config/tokenizer/weights)
bridge.save_hf_pretrained(model, "./hf_exports/llama32_1b")

# 4b) Or stream only weights (Megatron → HF)
for name, weight in bridge.export_hf_weights(model, cpu=True):
    print(name, tuple(weight.shape))

Training quickstart using pre-configured recipes:

from megatron.bridge.recipes.llama import llama32_1b_pretrain_config
from megatron.bridge.training.gpt_step import forward_step
from megatron.bridge.training.pretrain import pretrain

if __name__ == "__main__":
    # The recipe uses the Llama 3.2 1B model configuration from HuggingFace
    cfg = llama32_1b_pretrain_config(seq_length=1024)

    # Override training parameters
    cfg.train.train_iters = 10
    cfg.scheduler.lr_decay_iters = 10000
    cfg.model.vocab_size = 8192
    cfg.tokenizer.vocab_size = cfg.model.vocab_size

    pretrain(cfg, forward_step)

You can launch the above script with:

torchrun --nproc-per-node=<num devices> /path/to/script.py
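When picking a value for <num devices>, it may help to recall how Megatron-style parallelism divides ranks: the world size is the product of the tensor-, pipeline-, context-, and data-parallel sizes. The snippet below is a minimal sketch of that arithmetic for dense models. The helper name `data_parallel_size` is hypothetical and is not part of Megatron Bridge, and expert parallelism is not modeled here.

```python
def data_parallel_size(world_size: int, tp: int = 1, pp: int = 1, cp: int = 1) -> int:
    """Return the data-parallel size implied by a world size and model-parallel sizes.

    Hypothetical helper for illustration only; not a Megatron Bridge API.
    """
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by tp*pp*cp={model_parallel}"
        )
    return world_size // model_parallel


if __name__ == "__main__":
    # 8 GPUs with tensor parallel 2 and pipeline parallel 2 leave 2 data-parallel replicas.
    print(data_parallel_size(8, tp=2, pp=2))  # 8 // (2 * 2) = 2
```

With the quickstart settings above (tensor and pipeline parallel sizes of 1), every launched process would act as its own data-parallel replica under this arithmetic.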

More examples:

For a deeper dive into conversion design and advanced usage, see the models README.

🚀 Key Features

- Bridge with 🤗 Hugging Face: Seamless bidirectional conversion between 🤗 Hugging Face and Megatron formats for interoperability (model bridges, auto bridge, conversion examples)
  - Online import/export without intermediate full checkpoints
  - Parallelism-aware (TP/PP/VPP/CP/EP/ETP) during conversion
  - Memory-efficient per-parameter streaming
  - Simple high-level AutoBridge API with architecture auto-detection
  - Optimized paths when Transformer Engine is available
- Flexible to Customize: Lightweight custom training loop making it easy to configure custom logic in data loading, distributed training, checkpointing, evaluation and logging (training framework, training utilities)
- Supervised & Parameter-Efficient Finetuning: SFT & PEFT implementation tailored for Megatron-based models that supports LoRA, DoRA, and user-defined PEFT methods (PEFT implementations, finetune module, SFT dataset)
- SOTA Training Recipes: Pre-configured production-ready training recipes for popular models like Llama 3, with optimized hyperparameters and distributed training configuration (Llama recipes, recipe examples)
- Performance Optimization: Built-in support for FP8 training, model parallelism, and memory-efficient techniques to offer high utilization and near-linear scalability to thousands of nodes (mixed precision, communication overlap, optimizer utilities)
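To make the idea of parallelism-aware, per-parameter streaming more concrete, the following conceptual sketch gathers one weight at a time from tensor-parallel shards and yields it to the caller, so a full converted state dict is never held in memory at once. It is an illustration only: `stream_gathered` and the toy weight names are hypothetical, it assumes each rank holds a contiguous slice along the first dimension, and it does not reflect how Megatron Bridge handles real layouts.

```python
from typing import Dict, Iterator, List, Tuple

Matrix = List[List[float]]


def stream_gathered(shards_by_rank: List[Dict[str, Matrix]]) -> Iterator[Tuple[str, Matrix]]:
    """Yield (name, full_weight) pairs one parameter at a time.

    Hypothetical sketch: each rank is assumed to hold a contiguous block of rows,
    so the full weight is the row-wise concatenation of shards in rank order.
    """
    for name in shards_by_rank[0]:
        full = [row for shard in shards_by_rank for row in shard[name]]
        yield name, full


if __name__ == "__main__":
    # Two toy "ranks", each holding one row of a 2x2 weight.
    rank0 = {"layers.0.linear.weight": [[1.0, 2.0]]}
    rank1 = {"layers.0.linear.weight": [[3.0, 4.0]]}
    for name, weight in stream_gathered([rank0, rank1]):
        print(name, (len(weight), len(weight[0])))
```

Because the function is a generator, the consumer decides what to do with each weight (write it to disk, compare it against a reference, and so on) before the next one is assembled.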

Supported Models

Megatron Bridge provides out-of-the-box bridges and training recipes for a wide range of models, built on top of base model architectures from Megatron Core. Refer to the models directory for the most up-to-date list of model bridges.

Supported Models Overview

For more details on supported models, see our documentation:

- Large Language Models
- Vision Language Models

Model	Checkpoint Conversion	Pretrain Recipes	SFT & LoRA Recipes
DeepSeek V2	✅	✅ (v2)	Coming soon
DeepSeek V2 Lite	✅	✅ (v2-lite)	Coming soon
DeepSeek V3	✅	✅ (v3)	Coming soon
Gemma	✅	Coming soon	Coming soon
Gemma 2	✅	Coming soon	Coming soon
Gemma 3	✅	✅ (1B)	✅ (1B)
Gemma 3-VL	✅	Coming soon	✅ (4B/12B/27B)
GLM-4.5	✅	✅ (106B-Air/355B)	✅ (106B-Air/355B)
GPT-oss	✅	✅ (20B/120B)	✅ (20B/120B)
Llama 2	✅	✅ (7B)	Coming soon
Llama 3	✅	✅ (8B/70B)	✅ (8B/70B)
Llama 3.1	✅	✅ (8B/70B/405B)	✅ (8B/70B/405B)
Llama 3.2	✅	✅ (1B/3B)	✅ (1B/3B)
Llama 3.3	✅	Coming soon	Coming soon
Llama Nemotron	✅	Coming soon	Coming soon
Mistral	✅	Coming soon	Coming soon
Moonlight	✅	✅ (16B)	✅ (16B)
Nemotron	✅	Coming soon	Coming soon
Nemotron-H	✅	✅ (4B/8B/47B/56B)	Coming soon
Nemotron Nano v2	✅	✅ (9B/12B)	Coming soon
Nemotron Nano v2 VL	✅	Coming soon	✅ (9B/12B)
OlMoE	✅	✅ (7B)	✅ (7B)
Qwen2	✅	✅ (500M/1.5B/7B/72B)	✅ (500M/1.5B/7B/72B)
Qwen2.5	✅	✅ (500M/1.5B/7B/14B/32B/72B)	✅ (500M/1.5B/7B/14B/32B/72B)
Qwen2.5-VL	✅	Coming soon	✅ (3B/7B/32B/72B)
Qwen3	✅	✅ (600M/1.7B/4B/8B/14B/32B)	✅ (600M/1.7B/4B/8B/14B/32B)
Qwen3-MoE	✅	✅ (A3B/A22B)	✅ (A3B/A22B)
Qwen3 Next	✅	✅ (80B-A3B)	✅ (80B-A3B)
Qwen3-VL	✅	Coming soon	✅ (8B/A3B-A30B-MoE)

Launching Recipes

For a conceptual overview of how recipes are structured, overridden, and launched with either torchrun or NeMo-Run, read the Using Recipes guide.

Runnable tutorials live in tutorials/recipes/llama and cover:

- 00_quickstart_pretrain.py for mock-data pretraining
- 01_quickstart_finetune.py + LoRA configs
- YAML-driven flows and launch helpers
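As a rough illustration of what overriding a recipe means, the sketch below applies dotted-key overrides such as `train.train_iters` to a nested config object. Everything here is an assumption for demonstration purposes: the dataclasses are toy stand-ins for a recipe config, and `apply_override` is a hypothetical helper, not the mechanism Megatron Bridge uses.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TrainConfig:  # toy stand-in, not the real training config
    train_iters: int = 100


@dataclass
class SchedulerConfig:  # toy stand-in, not the real scheduler config
    lr_decay_iters: int = 100


@dataclass
class Config:
    train: TrainConfig = field(default_factory=TrainConfig)
    scheduler: SchedulerConfig = field(default_factory=SchedulerConfig)


def apply_override(cfg: Any, dotted_key: str, value: Any) -> None:
    """Set a nested attribute such as 'train.train_iters' on cfg (hypothetical helper)."""
    *parents, leaf = dotted_key.split(".")
    target = cfg
    for part in parents:
        target = getattr(target, part)
    if not hasattr(target, leaf):
        # Reject typos instead of silently creating a new attribute.
        raise AttributeError(f"unknown config field: {dotted_key}")
    setattr(target, leaf, value)


if __name__ == "__main__":
    cfg = Config()
    apply_override(cfg, "train.train_iters", 10)
    apply_override(cfg, "scheduler.lr_decay_iters", 10000)
    print(cfg.train.train_iters, cfg.scheduler.lr_decay_iters)
```

Rejecting unknown keys is a deliberate choice in this sketch, since a misspelled override that is silently accepted can be hard to notice in a long training run.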

Performance Benchmarks

For detailed performance benchmarks including throughput metrics across different GPU systems (DGX-GB200, DGX-B200, DGX-H100) and model configurations, see the Performance Summary in our documentation.

Project Structure

Megatron-Bridge/
├── examples/
│   ├── models/                  # Bridge usage examples
│   └── recipes/                 # Training examples
├── src/megatron/bridge/
│   ├── data/                    # Dataloaders and iterators
│   ├── models/                  # Hugging Face bridge infrastructure and model-specific implementations
│   │   ├── llama/               # Llama model providers
│   │   └── .../                 # Other models (gpt, t5, etc.)
│   ├── peft/                    # PEFT transformations and wrappers
│   ├── recipes/                 # Complete training recipes
│   ├── training/                # Training loop components
│   │   ├── tokenizers/          # Tokenizer library
│   │   └── utils/               # Training-specific utilities
│   └── utils/                   # Generic utilities for repo-wide usage
└── tests/                       # Comprehensive test suite

Acknowledgement & Contributing

Megatron-Bridge is the continuation of MBridge by Yan Bai. We appreciate all the contributions and adoption by our community partners:

- veRL has adopted MBridge as a connector to Megatron-Core.
- slime has adopted MBridge as a Megatron-Core checkpoint converter.
- SkyRL has adopted MBridge as a Megatron-Core connector and is migrating to Megatron-Bridge.
- Nemo-RL has adopted Megatron-Bridge as a Megatron-Core connector.
- Community contributions: Special thanks to Guanyou He and Junyu Wu from Weixin Group Infrastructure Center.

Please see our Contributor Guidelines for more information on how to get involved.
