# SLM-Lab
Modular Deep Reinforcement Learning framework in PyTorch.
Companion library of the book *Foundations of Deep Reinforcement Learning*.

Documentation · Benchmark Results
NOTE: v5.0 updates to Gymnasium, uv tooling, and modern dependencies with ARM support - see CHANGELOG.md. Book readers: `git checkout v4.1.1` for the *Foundations of Deep Reinforcement Learning* code.
*Demos of trained agents on Atari (BeamRider, Breakout, KungFuMaster, MsPacman, Pong, Qbert, Seaquest, Sp.Invaders) and MuJoCo (Ant, HalfCheetah, Hopper, Humanoid, Inv.DoublePendulum, InvertedPendulum, Reacher, Walker) environments.*
SLM Lab is a software framework for reinforcement learning (RL) research and application in PyTorch. RL trains agents to make decisions by learning from trial and error, like teaching a robot to walk or an AI to play games.
| Feature | Description |
|---|---|
| Ready-to-use algorithms | PPO, SAC, DQN, A2C, REINFORCE—validated on 70+ environments |
| Easy configuration | JSON spec files fully define experiments—no code changes needed |
| Reproducibility | Every run saves its spec + git SHA for exact reproduction |
| Automatic analysis | Training curves, metrics, and TensorBoard logging out of the box |
| Cloud integration | dstack for GPU training, HuggingFace for sharing results |
| Algorithm | Type | Best For | Validated Environments |
|---|---|---|---|
| REINFORCE | On-policy | Learning/teaching | Classic |
| SARSA | On-policy | Tabular-like | Classic |
| DQN/DDQN+PER | Off-policy | Discrete actions | Classic, Box2D, Atari |
| A2C | On-policy | Fast iteration | Classic, Box2D, Atari |
| PPO | On-policy | General purpose | Classic, Box2D, MuJoCo (11), Atari (54) |
| SAC | Off-policy | Continuous control | Classic, Box2D, MuJoCo |
See Benchmark Results for detailed performance data.

SLM Lab uses Gymnasium (the maintained fork of OpenAI Gym):
| Category | Examples | Difficulty | Docs |
|---|---|---|---|
| Classic Control | CartPole, Pendulum, Acrobot | Easy | Gymnasium Classic |
| Box2D | LunarLander, BipedalWalker | Medium | Gymnasium Box2D |
| MuJoCo | Hopper, HalfCheetah, Humanoid | Hard | Gymnasium MuJoCo |
| Atari | Breakout, MsPacman, and 54 more | Varied | ALE |
Any gymnasium-compatible environment works—just specify its name in the spec.
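As an illustration, a spec is plain JSON that names the agent and the environment, so it can be inspected with the standard library alone. The field names below follow the general shape of SLM Lab specs but are assumptions for this sketch, not copied from a real spec file:

```python
import json

# Hypothetical minimal spec: the env "name" field is where any
# gymnasium-registered environment id goes. The surrounding keys
# are illustrative and may differ from real SLM Lab spec files.
spec_text = """
{
  "ppo_cartpole": {
    "agent": [{"name": "PPO"}],
    "env": [{"name": "CartPole-v1", "max_frame": 100000}]
  }
}
"""
spec = json.loads(spec_text)
env_name = spec["ppo_cartpole"]["env"][0]["name"]
print(env_name)  # -> CartPole-v1
```

Under this assumption, swapping `"CartPole-v1"` for any other registered Gymnasium id is the only change needed to target a different environment.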
```shell
# Install
uv sync
uv tool install --editable .

# Run demo (PPO CartPole)
slm-lab run            # PPO CartPole
slm-lab run --render   # with visualization

# Run custom experiment
slm-lab run spec.json spec_name train         # local training
slm-lab run-remote spec.json spec_name train  # cloud training (dstack)

# Help (CLI uses Typer)
slm-lab --help       # list all commands
slm-lab run --help   # options for run command

# Troubleshoot: if slm-lab not found, use uv run
uv run slm-lab run
```
Run experiments on cloud GPUs with automatic result sync to HuggingFace.
```shell
# Setup
cp .env.example .env     # Add HF_TOKEN
uv tool install dstack   # Install dstack CLI
# Configure dstack server - see https://dstack.ai/docs/quickstart

# Run on cloud
slm-lab run-remote spec.json spec_name train        # CPU training (default)
slm-lab run-remote spec.json spec_name search       # CPU ASHA search (default)
slm-lab run-remote --gpu spec.json spec_name train  # GPU training (for image envs)

# Sync results
slm-lab pull spec_name   # Download from HuggingFace
slm-lab list             # List available experiments
```
Config options in `.dstack/`: `run-gpu-train.yml`, `run-gpu-search.yml`, `run-cpu-train.yml`, `run-cpu-search.yml`
For a lightweight box that only dispatches dstack runs, syncs results, and generates plots (no local ML training):
```shell
uv sync --no-default-groups   # skip ML deps (torch, gymnasium, etc.)
uv tool install dstack
uv run --no-default-groups slm-lab run-remote spec.json spec_name train
uv run --no-default-groups slm-lab pull spec_name
uv run --no-default-groups slm-lab plot -f folder1,folder2
```

If you use SLM Lab in your research, please cite:
```bibtex
@misc{kenggraesser2017slmlab,
  author = {Keng, Wah Loon and Graesser, Laura},
  title = {SLM Lab},
  year = {2017},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/kengz/SLM-Lab}},
}
```
License: MIT