A Modular Library for Off-Policy Reinforcement Learning with a focus on SafeRL and distributed computing
schatty/oprl

A Modular Library for Off-Policy Reinforcement Learning with a focus on SafeRL and distributed computing. Benchmarking results are available on the associated homepage.
The project is under active renovation. For the old code, with the D4PG algorithm working with multiprocessing queues and mujoco_py, please refer to the d4pg_legacy branch.
- Switching to mujoco 3.1.1
- Replacing multiprocessing queues with RabbitMQ for distributed RL
- Baselines with DDPG, TQC for dm_control for 1M steps
- Tests
- Support for SafetyGymnasium
- Style and readability improvements
- Baselines with distributed algorithms for dm_control
- D4PG logic on top of TQC
To install the dependencies and the package:
pip install -r requirements.txt
cd src && pip install -e .
To work with SafetyGymnasium, install it manually:
git clone https://github.com/PKU-Alignment/safety-gymnasium
cd safety-gymnasium && pip install -e .
To run DDPG in a single process:
python src/oprl/configs/ddpg.py --env walker-walk
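To repeat the single-process run over several environments, the command above can be wrapped in a small script. This is a minimal sketch and not part of oprl: the helper names `build_command` and `run_all` and the `dry_run` flag are hypothetical, and the only CLI option used is the `--env` flag shown above. Only walker-walk appears in this README, so any other environment name you add is an assumption to check against the config script.

```python
import subprocess
import sys

# Path taken from the command above; adjust if your checkout differs.
DDPG_CONFIG = "src/oprl/configs/ddpg.py"


def build_command(env, config=DDPG_CONFIG):
    """Return the argument list for one training run (hypothetical helper)."""
    return [sys.executable, config, "--env", env]


def run_all(envs, dry_run=True):
    """Build one command per environment and run them sequentially."""
    commands = [build_command(env) for env in envs]
    for cmd in commands:
        if dry_run:
            # Only print the command so it can be inspected first.
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if a run fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Only walker-walk is confirmed by the README; extend at your own risk.
    run_all(["walker-walk"], dry_run=True)
```

With `dry_run=True` the script only prints the commands. Set it to `False` from the repository root to actually start training.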
To run distributed DDPG, first start RabbitMQ:
docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3.12-management
Then run training:
python src/oprl/configs/d3pg.py --env walker-walk
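Before launching the distributed training command, it can help to confirm that the RabbitMQ container is accepting connections. The following is a minimal sketch using only the Python standard library. The helper names `broker_is_up` and `wait_for_broker` are hypothetical and not part of oprl, and the host and port are assumed from the `-p 5672:5672` mapping in the docker command above. An open TCP port does not guarantee that the broker has finished starting, so treat this as a coarse check.

```python
import socket
import time


def broker_is_up(host="localhost", port=5672, timeout=1.0):
    """Return True if a TCP connection to the broker port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def wait_for_broker(host="localhost", port=5672, retries=10, delay=1.0):
    """Poll the port until it accepts connections or retries run out."""
    for _ in range(retries):
        if broker_is_up(host, port):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    if wait_for_broker(retries=3, delay=0.5):
        print("RabbitMQ port is open; training can be started.")
    else:
        print("RabbitMQ is not reachable on localhost:5672.")
```

If the check fails, make sure the docker command is still running in another terminal and that port 5672 is not blocked.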
To run the tests:
cd src && pip install -e .
cd .. && pip install -r tests/functional/requirements.txt
python -m pytest tests
Results for single-process DDPG and TQC:
- DDPG and TD3 code is based on the official TD3 implementation: sfujim/TD3
- TQC code is based on the official TQC implementation: SamsungLabs/tqc
- SafetyGymnasium: PKU-Alignment/safety-gymnasium