NeMo RL Documentation
Welcome to the NeMo RL documentation. NeMo RL is an open-source post-training library developed by NVIDIA, designed to streamline and scale reinforcement learning methods for multimodal models (LLMs, VLMs, etc.).
This documentation provides comprehensive guides, examples, and references to help you get started with NeMo RL and build powerful post-training pipelines for your models.
Getting Started
Learn about NeMo RL’s architecture, design philosophy, and key features that make it ideal for scalable reinforcement learning.
Get up and running quickly with examples for both DTensor and Megatron Core training backends.
Step-by-step instructions for installing NeMo RL, including prerequisites, system dependencies, and environment setup.
Explore the current features and upcoming enhancements in NeMo RL, including distributed training, advanced parallelism, and more.
Troubleshoot common issues, such as missing submodules and Ray dashboard access, and learn debugging techniques.
Training and Generation
Learn about DTensor and Megatron Core training backends, their capabilities, and how to choose the right one for your use case.
Discover supported algorithms including GRPO, SFT, DPO, RM, and on-policy distillation with detailed guides and examples.
Learn how to evaluate your models using built-in evaluation datasets and custom evaluation pipelines.
Configure and deploy NeMo RL on multi-node Slurm or Kubernetes clusters for distributed computing.
Guides and Examples
Reproduce DeepScaleR results with NeMo RL using GRPO on mathematical reasoning tasks.
Step-by-step guide for supervised fine-tuning on the OpenMathInstruct2 dataset.
Create custom reward environments and integrate them with NeMo RL training pipelines.
Learn how to add support for new model architectures in NeMo RL.
Advanced Topics
Deep dive into NeMo RL’s architecture, APIs, and design decisions for scalable RL.
Tools and techniques for debugging distributed Ray applications and RL training runs.
Optimize large language models with FP8 quantization for faster training and inference.
Build and use Docker containers for reproducible NeMo RL environments.
API Reference
Comprehensive reference for all NeMo RL modules, classes, functions, and methods. Browse the complete Python API with detailed docstrings and usage examples.