Pull requests: NVIDIA/Megatron-LM
Pull requests list
Add megatron_tokenizer and fix distrib_optimizer
#3521 opened Feb 20, 2026 by shanmugamr1992
6 tasks
Track and plot per-token off-policy in RL [complexity: medium] [Expert Review]
Fix Megatron-FSDP optimizer state DCP checkpointing, and fix DTensor deepcopy bug from PyTorch 26.01. [bug] [Expert Review] [module: megatron-fsdp]
Change the cudagraph distribution from linearly to exponentially-decreasing [complexity: low] [Expert Review]
Multimodal: fix model provider [complexity: low] [Expert Review]
#3508 opened Feb 20, 2026 by faradawn
1 of 6 tasks
Multimodal: Fix training script to enable multimodal tokenizer and fix Triton Cache Manager patch [complexity: low] [Expert Review]
#3507 opened Feb 20, 2026 by faradawn
1 of 6 tasks
docs: Update docs for 0.16.0 [complexity: low] [Expert Review]
Fixed fp32 residuals [Expert Review]
Fix documented shape [complexity: low] [docs-only] [Expert Review]
Mmiranda attempt fix build errors [docs-only]
#3479 opened Feb 18, 2026 by megnvidia
6 tasks
Automatically add review label [complexity: medium] [Expert Review]
Remove redundant CUDA calls in the LLaVA dataloader [Final Review]
Add httpx boilerplate to RL OpenAI connections [complexity: low] [Expert Review]
remove attn mask from seqPack [complexity: low]
#3471 opened Feb 18, 2026 by jalbericiola
6 tasks
Multimodal: add tokenizer path [complexity: low] [Expert Review]
#3466 opened Feb 18, 2026 by faradawn
1 of 6 tasks
Multimodal: fix VQA dataset selection [complexity: low] [docs-only] [Expert Review]
#3464 opened Feb 17, 2026 by faradawn
1 of 6 tasks
Fix memory issue in mxfp8 model init [community-request] [complexity: low] [Final Review]
Add separate mtp_grad_scale_func for MTP loss scaling [complexity: low] [Expert Review]
#3459 opened Feb 17, 2026 by yfw
6 tasks