schatty/oprl

OPRL

A Modular Library for Off-Policy Reinforcement Learning with a focus on SafeRL and distributed computing. Benchmarking results are available on the associated homepage.

Code style: black

Disclaimer

The project is under active renovation. For the old code, with the D4PG algorithm working with multiprocessing queues and mujoco_py, please refer to the d4pg_legacy branch.

Roadmap 🏗

  • Switching to mujoco 3.1.1
  • Replacing multiprocessing queues with RabbitMQ for distributed RL
  • Baselines with DDPG, TQC for dm_control for 1M steps
  • Tests
  • Support for SafetyGymnasium
  • Style and readability improvements
  • Baselines with distributed algorithms for dm_control
  • D4PG logic on top of TQC

Installation

pip install -r requirements.txt
cd src && pip install -e .

To work with SafetyGymnasium, install it manually:

git clone https://github.com/PKU-Alignment/safety-gymnasium
cd safety-gymnasium && pip install -e .
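
After installing, a quick import check can confirm that the editable installs are visible to the interpreter. The snippet below is a minimal sketch, not part of the library: the import name `oprl` is assumed from the `src/oprl` path, `safety_gymnasium` is assumed to be the import name of the SafetyGymnasium package, and `missing_modules` is a hypothetical helper.

```python
import importlib.util


def missing_modules(names):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names (see the note above); adjust if they differ.
    missing = missing_modules(["oprl", "safety_gymnasium"])
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only means the packages can be found; it does not verify that MuJoCo or the environments load correctly.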

Usage

To run DDPG in a single process

python src/oprl/configs/ddpg.py --env walker-walk
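
To launch the same config for several environments in sequence, the command above can be wrapped in a small script. This is a sketch under assumptions: `build_command` and `run_all` are hypothetical helpers, only the `--env` flag and the `walker-walk` name come from this README, and any other environment name would need to be checked against the config script.

```python
import subprocess
import sys

CONFIG = "src/oprl/configs/ddpg.py"  # path taken from the command above


def build_command(env, config=CONFIG):
    """Build the argument list for one single-process training run."""
    return [sys.executable, config, "--env", env]


def run_all(envs):
    """Run the training script once per environment, stopping on the first failure."""
    for env in envs:
        subprocess.run(build_command(env), check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of starting training.
    for env in ["walker-walk"]:
        print(" ".join(build_command(env)))
```

Calling `run_all(["walker-walk"])` from the repository root would start the actual run.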

To run distributed DDPG

Run RabbitMQ

docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3.12-management
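
The broker takes a few seconds to start, so it can help to wait until the AMQP port published above (5672) accepts connections before starting training. The following is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, and the host `localhost` is an assumption that matches running the container on the same machine.

```python
import socket
import time


def wait_for_port(host="localhost", port=5672, timeout=30.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful TCP connect only shows the port is open,
            # not that RabbitMQ has finished initialising.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example: if not wait_for_port(): raise SystemExit("RabbitMQ is not reachable")
```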

Run training

python src/oprl/configs/d3pg.py --env walker-walk

Tests

cd src && pip install -e .
cd .. && pip install -r tests/functional/requirements.txt
python -m pytest tests

Results

Results for single-process DDPG and TQC (figure: ddpg_tqc_eval).

Acknowledgements
