ElegantRL: Massively Parallel Deep Reinforcement Learning. 🔥

The name "Xiaoya" (小雅) comes from the Classic of Poetry (《诗经·小雅·鹤鸣》) and reflects the proverb "stones from other hills may serve to polish jade": learn from others' strengths to refine one's own work.
ElegantRL (website) is developed for users and developers, and offers the following advantages:
- Cloud-native: follows a cloud-native paradigm through micro-service architecture and containerization, and supports ElegantRL-Podracer and FinRL-Podracer.
- Scalable: fully exploits the parallelism of DRL algorithms, so it easily scales out to hundreds or thousands of computing nodes on a cloud platform, say, a DGX SuperPOD platform with thousands of GPUs.
- Elastic: allows elastic and automatic allocation of computing resources on the cloud.
- Lightweight: the core code has fewer than 1,000 lines (see ElegantRL-Helloworld).
- Efficient: in many test cases (e.g., single-GPU, multi-GPU, and GPU-cloud), it is more efficient than Ray RLlib.
- Stable: much more stable than Stable Baselines 3, thanks to techniques such as the Hamiltonian term.
- Practical: used in multiple projects (RLSolver, FinRL, FinRL-Meta, etc.).

Massively parallel simulation is used in multiple projects (RLSolver, FinRL, etc.); the sampling speed is high because many GPU-based environments can run in parallel.
ElegantRL implements the following model-free deep reinforcement learning (DRL) algorithms:
- DDPG, TD3, SAC, PPO, REDQ for continuous actions in single-agent environments,
- DQN, Double DQN, D3QN for discrete actions in single-agent environments,
- QMIX, VDN, MADDPG, MAPPO, MATD3 for multi-agent environments.
For more details of DRL algorithms, please refer to the educational webpage OpenAI Spinning Up.
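As a quick illustration of how these algorithm families map onto code, the sketch below picks a default agent class by action space. The class names follow the AgentXXX.py convention in elegantrl/agents, but check the exact exports in your installed version.

```python
# A minimal sketch, not an official API: choose a default agent class by
# action space. Verify these imports against elegantrl/agents in your version.
from elegantrl.agents import AgentDQN, AgentPPO, AgentSAC

def pick_agent_class(if_discrete: bool, if_off_policy: bool):
    """Return a sensible default agent class for the task type."""
    if if_discrete:
        return AgentDQN  # discrete actions: the DQN family
    # continuous actions: off-policy SAC or on-policy PPO
    return AgentSAC if if_off_policy else AgentPPO
```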
ElegantRL supports the following simulators:
- Isaac Gym for massively parallel simulations,
- OpenAI Gym, MuJoCo, PyBullet, FinRL for benchmarking.
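For a concrete starting point, here is a minimal sketch of training PPO on a Gym task. It assumes the Config and train_agent entry points from the elegantrl/train folder described later in this README; argument names may vary between ElegantRL versions.

```python
# A minimal sketch, assuming the Config and train_agent entry points from
# elegantrl/train; names and arguments may differ between versions.
import gym

from elegantrl.agents import AgentPPO
from elegantrl.train.config import Config
from elegantrl.train.run import train_agent

env_args = {
    "env_name": "Pendulum-v1",  # a classic continuous-control benchmark
    "state_dim": 3,
    "action_dim": 1,
    "if_discrete": False,
}
args = Config(agent_class=AgentPPO, env_class=gym.make, env_args=env_args)
args.break_step = int(1e5)  # stop training after 1e5 environment steps
train_agent(args)
```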
Blog posts:
- [Towardsdatascience] A New Era of Massively Parallel Simulation: A Practical Tutorial Using ElegantRL, Nov. 2, 2022.
- [MLearning.ai] ElegantRL: Much More Stable Deep Reinforcement Learning Algorithms than Stable-Baseline3, Mar. 3, 2022.
- [Towardsdatascience] ElegantRL-Podracer: A Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning, Dec. 11, 2021.
- [Towardsdatascience] ElegantRL: Mastering PPO Algorithms, May 3, 2021.
- [MLearning.ai] ElegantRL Demo: Stock Trading Using DDPG (Part II), Apr. 19, 2021.
- [MLearning.ai] ElegantRL Demo: Stock Trading Using DDPG (Part I), Mar. 28, 2021.
- [Towardsdatascience] ElegantRL-Helloworld: A Lightweight and Stable Deep Reinforcement Learning Library, Mar. 4, 2021.
For beginners, we maintain ElegantRL-Helloworld as a tutorial. Its goal is to give you hands-on experience with ElegantRL.
- Run the tutorial code and learn the RL algorithms in this order: DQN -> DDPG -> PPO.
- Share suggestions for ElegantRL-Helloworld in a GitHub issue.
One-sentence summary: an agent (agent.py) with Actor-Critic networks (net.py) is trained (run.py) by interacting with an environment (env.py); a conceptual sketch of this loop follows the directory layout below.
```
elegantrl  # main folder
  - agents  # a collection of DRL algorithms
    - AgentXXX.py  # one family of DRL algorithms
    - net.py  # a collection of network architectures
  - envs  # a collection of environments
    - XxxEnv.py  # a training environment for RL
  - train  # a collection of training programs
    - demo.py  # a collection of demos
    - config.py  # configurations (hyper-parameters)
    - run.py  # training loop
    - worker.py  # the worker class (explores the env, saves data to the replay buffer)
    - learner.py  # the learner class (updates the networks using data from the replay buffer)
    - evaluator.py  # the evaluator class (evaluates the cumulative reward of the policy network)
    - replay_buffer.py  # the buffer class (stores sequences of transitions for training)
```
```
elegantrl_helloworld  # tutorial version
  - config.py  # configurations (hyper-parameters)
  - agent.py  # DRL algorithms
  - net.py  # network architectures
  - run.py  # training loop
  - env.py  # environments for RL training

examples  # a collection of example code (ready-to-run Google Colab notebooks)
  - quickstart_Pendulum_v1.ipynb
  - tutorial_BipedalWalker_v3.ipynb
  - tutorial_Creating_ChasingVecEnv.ipynb
  - tutorial_LunarLanderContinuous_v2.ipynb

unit_tests  # a collection of tests
```
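The one-sentence summary above corresponds to a simple loop over the worker, learner, and evaluator roles listed in the train folder. The sketch below is a conceptual outline only, not the actual run.py (which adds multiprocessing and GPU-to-GPU data transfer); the method names explore_env and update_net mirror the agent API but should be treated as illustrative.

```python
# Conceptual outline of the worker/learner/evaluator roles described above;
# a minimal sketch, not the real elegantrl/train/run.py.
def training_loop(agent, env, buffer, evaluator,
                  horizon_len=2048, break_step=100_000):
    total_steps = 0
    while total_steps < break_step:
        # Worker: explore the env and save transitions to the replay buffer.
        trajectory = agent.explore_env(env, horizon_len)
        buffer.update(trajectory)
        total_steps += horizon_len

        # Learner: update the Actor-Critic networks using buffered data.
        logging_tuple = agent.update_net(buffer)

        # Evaluator: measure the cumulative reward of the policy network.
        evaluator.evaluate_and_save(agent.act, total_steps, logging_tuple)
```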
Experiments on Ant (MuJoCo), Humanoid (MuJoCo), Ant (Isaac Gym), and Humanoid (Isaac Gym), from left to right.
ElegantRL fully supports Isaac Gym, which runs massively parallel simulations (e.g., 4096 sub-envs) on one GPU.
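As a rough sketch of what massively parallel rollout looks like, the snippet below steps thousands of sub-environments in lockstep with batched tensors on one GPU. The IsaacVecEnv name, its import path, and its constructor arguments are hypothetical stand-ins; see elegantrl/envs for the actual Isaac Gym integration in your version.

```python
# Illustrative sketch only: IsaacVecEnv, its import path, and its arguments
# are hypothetical stand-ins for the actual Isaac Gym wrapper in elegantrl/envs.
import torch
from elegantrl.envs.IsaacGym import IsaacVecEnv  # hypothetical path

num_envs = 4096  # thousands of sub-envs stepped in lockstep on one GPU
env = IsaacVecEnv(env_name="Ant", env_num=num_envs, device_id=0)

states = env.reset()  # batched tensor of shape (num_envs, state_dim)
for _ in range(16):
    # Random actions in [-1, 1] just to exercise the env; a trained policy
    # network would produce these instead.
    actions = torch.rand((num_envs, env.action_dim), device=states.device) * 2 - 1
    states, rewards, dones, info = env.step(actions)  # tensors stay on the GPU
```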
Experiment on Hopper-v2: ElegantRL achieves much smaller variance (averaged over 8 runs). Also, PPO+H in ElegantRL completed training on 5M samples about 6x faster than Stable Baselines 3.
Our tests are written with the built-in unittest Python module. To run a specific test file (for example, test_training_agents.py), use the following command from the root directory:
```
python -m unittest unit_tests/test_training_agents.py
```
To run all tests sequentially, use:
```
python -m unittest discover
```
Please note that some of the tests require Isaac Gym to be installed on your system. If it is not, any tests related to Isaac Gym will fail.
We welcome contributions to the codebase, but please do not submit or push code that breaks the tests. Also, please avoid modifying the tests just to get your proposed changes to pass. As it stands, the tests are quite minimal (instantiating environments, training agents for one step, etc.), so if they break, it is almost certainly a problem with your code rather than with the tests.
We are actively refactoring to make the codebase cleaner and more performant. If you would like to help clean up some code, we strongly encourage you to watch Uncle Bob's clean coding lessons if you have not already.
Necessary:

| Package | Notes |
|---|---|
| Python 3.6+ | |
| PyTorch 1.6+ | |

Not necessary:

| Package | Notes |
|---|---|
| Numpy 1.18+ | For ReplayBuffer. NumPy is installed along with PyTorch. |
| gym 0.17.0 | For envs. Gym provides tutorial envs for DRL training. (env.render() has a bug with gym==0.18 and pyglet==1.6; use gym==0.17.0 with pyglet==1.5.) |
| pybullet 2.7+ | For envs. We use PyBullet (free) as an alternative to MuJoCo (not free). |
| box2d-py 2.3.8 | For gym. Use `pip install Box2D` (instead of box2d-py). |
| matplotlib 3.2 | For plots. |

Install:

```
pip3 install gym==0.17.0 pybullet Box2D matplotlib  # or: pip install -r requirements.txt
```

To install the StarCraft II env:

```
bash ./elegantrl/envs/installsc2.sh
pip install -r sc2_requirements.txt
```
To cite this repository:
```
@misc{erl,
  author = {Liu, Xiao-Yang and Li, Zechu and Zhu, Ming and Wang, Zhaoran and Zheng, Jiahao},
  title = {{ElegantRL}: Massively Parallel Framework for Cloud-native Deep Reinforcement Learning},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/AI4Finance-Foundation/ElegantRL}},
}

@article{liu2021elegantrl,
  title = {ElegantRL-Podracer: Scalable and elastic library for cloud-native deep reinforcement learning},
  author = {Liu, Xiao-Yang and Li, Zechu and Yang, Zhuoran and Zheng, Jiahao and Wang, Zhaoran and Walid, Anwar and Guo, Jian and Jordan, Michael I},
  journal = {NeurIPS, Workshop on Deep Reinforcement Learning},
  year = {2021}
}
```