PyTorch implementation of deep reinforcement learning algorithms
This repository contains PyTorch implementations of deep reinforcement learning algorithms. The repository will soon be updated to include the PyBullet environments!
- Deep Q-Network (DQN) (V. Mnih et al. 2015)
- Double DQN (DDQN) (H. Van Hasselt et al. 2015)
- Advantage Actor Critic (A2C)
- Vanilla Policy Gradient (VPG)
- Natural Policy Gradient (NPG) (S. Kakade et al. 2002)
- Trust Region Policy Optimization (TRPO) (J. Schulman et al. 2015)
- Proximal Policy Optimization (PPO) (J. Schulman et al. 2017)
- Deep Deterministic Policy Gradient (DDPG) (T. Lillicrap et al. 2015)
- Twin Delayed DDPG (TD3) (S. Fujimoto et al. 2018)
- Soft Actor-Critic (SAC) (T. Haarnoja et al. 2018)
- SAC with automatic entropy adjustment (SAC-AEA) (T. Haarnoja et al. 2018)
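As an illustration of the last entry, here is a minimal sketch (not the repository's code) of how the automatic entropy adjustment in SAC-AEA is typically implemented in PyTorch: the temperature `alpha` is learned so that the policy's entropy tracks a target, commonly set to `-dim(action_space)`. The names `log_alpha`, `target_entropy`, and `log_prob` are illustrative.

```python
import torch

# Hypothetical setup; action_dim and log_prob would come from the agent's policy.
action_dim = 6
target_entropy = -action_dim                      # common heuristic: -|A|
log_alpha = torch.zeros(1, requires_grad=True)    # learn log(alpha) so alpha stays positive
alpha_optimizer = torch.optim.Adam([log_alpha], lr=3e-4)

def update_alpha(log_prob: torch.Tensor) -> torch.Tensor:
    """One temperature update given log-probabilities of actions sampled from the policy."""
    alpha_loss = -(log_alpha * (log_prob + target_entropy).detach()).mean()
    alpha_optimizer.zero_grad()
    alpha_loss.backward()
    alpha_optimizer.step()
    return log_alpha.exp().detach()  # current alpha, used to weight the entropy bonus

# Example usage with fake log-probs from a batch of sampled actions
alpha = update_alpha(torch.randn(256))
```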
- Classic control environments (CartPole-v1, Pendulum-v0, etc.) (as described here)
- MuJoCo environments (Hopper-v2, HalfCheetah-v2, Ant-v2, Humanoid-v2, etc.) (as described here)
- PyBullet environments (HopperBulletEnv-v0, HalfCheetahBulletEnv-v0, AntBulletEnv-v0, HumanoidDeepMimicWalkBulletEnv-v1, etc.) (as described here)
Observation and action space sizes for the environments listed above:

Environment | Observation space | Action space
---|---|---
Hopper-v2 | 8 | 3
HalfCheetah-v2 | 17 | 6
Ant-v2 | 111 | 8
Humanoid-v2 | 376 | 17
HopperBulletEnv-v0 | 15 | 3
HalfCheetahBulletEnv-v0 | 26 | 6
AntBulletEnv-v0 | 28 | 8
HumanoidDeepMimicWalkBulletEnv-v1 | 197 | 36
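These dimensions can be checked directly. The snippet below is a small sketch assuming the classic `gym` API and the `pybullet` package (which provides the `pybullet_envs` registration module) are installed:

```python
import gym
import pybullet_envs  # noqa: F401 -- importing this module registers the Bullet envs with gym

env = gym.make("HalfCheetahBulletEnv-v0")
print(env.observation_space.shape)  # (26,)
print(env.action_space.shape)       # (6,)
env.close()
```

The MuJoCo environments (Hopper-v2, HalfCheetah-v2, etc.) can be inspected the same way, provided `mujoco-py` is installed.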
The repository's high-level structure is:
```
├── agents
│   └── common
├── results
│   ├── data
│   └── graphs
└── save_model
```
To train all the different agents on PyBullet environments, follow these steps:
```
git clone https://github.com/dongminlee94/deep_rl.git
cd deep_rl
python run_bullet.py
```
For other environments, change the last line to `run_cartpole.py`, `run_pendulum.py`, or `run_mujoco.py`.
If you want to change the configuration of the agents, adjust the command-line arguments, for example:
```
python run_bullet.py \
    --env=HumanoidDeepMimicWalkBulletEnv-v1 \
    --algo=sac-aea \
    --phase=train \
    --render=False \
    --load=None \
    --seed=0 \
    --iterations=200 \
    --steps_per_iter=5000 \
    --max_step=1000 \
    --tensorboard=True \
    --gpu_index=0
```
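For reference, flags like these are usually wired through `argparse`. The sketch below is illustrative only (it is not the repository's actual parser), but it mirrors the flag names used above; the `str2bool` helper is a hypothetical addition so that `--render=False` is parsed as a boolean rather than a non-empty string:

```python
import argparse

def str2bool(v: str) -> bool:
    # argparse's type=bool would treat the string "False" as True, so convert explicitly
    return str(v).lower() in ("true", "1", "yes")

parser = argparse.ArgumentParser(description="Deep RL runner (illustrative sketch)")
parser.add_argument("--env", type=str, default="HumanoidDeepMimicWalkBulletEnv-v1")
parser.add_argument("--algo", type=str, default="sac-aea",
                    help="dqn | ddqn | a2c | vpg | npg | trpo | ppo | ddpg | td3 | sac | sac-aea")
parser.add_argument("--phase", type=str, default="train", help="train or test")
parser.add_argument("--render", type=str2bool, default=False)
parser.add_argument("--load", type=str, default=None, help="saved model name under save_model/")
parser.add_argument("--seed", type=int, default=0)
parser.add_argument("--iterations", type=int, default=200)
parser.add_argument("--steps_per_iter", type=int, default=5000)
parser.add_argument("--max_step", type=int, default=1000)
parser.add_argument("--tensorboard", type=str2bool, default=True)
parser.add_argument("--gpu_index", type=int, default=0)
args = parser.parse_args()
```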
To watch all the learned agents on PyBullet environments, follow these steps:
```
python run_bullet.py \
    --env=HumanoidDeepMimicWalkBulletEnv-v1 \
    --algo=sac-aea \
    --phase=test \
    --render=True \
    --load=envname_algoname_... \
    --seed=0 \
    --iterations=200 \
    --steps_per_iter=5000 \
    --max_step=1000 \
    --tensorboard=False \
    --gpu_index=0
```
Copy the name of the saved model in `save_model/envname_algoname_...` and paste it into `--load=envname_algoname_...` so that the saved model is loaded.
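The exact checkpoint format depends on the repository's save code, but conceptually the test phase reloads weights saved during training. A hedged sketch, assuming a state dict saved with `torch.save` under `save_model/` (the file name below is a placeholder; use the real name produced during training):

```python
import os
import torch

save_dir = "./save_model"
ckpt_name = "envname_algoname_..."  # placeholder: copy the actual saved model name here

state_dict = torch.load(os.path.join(save_dir, ckpt_name), map_location="cpu")
# policy.load_state_dict(state_dict)  # `policy` would be the agent's actor network
```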