hcnoh/rl-collection-pytorch

A collection of Reinforcement Learning implementations with PyTorch


This repository is a collection of the following reinforcement learning algorithms:

  • Policy-Gradient
  • Actor-Critic
  • Trust Region Policy Optimization
  • Generalized Advantage Estimation
  • Proximal Policy Optimization

More algorithms will be added to this repository.
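
As an illustration of the kind of computation these algorithms involve, the following is a minimal sketch of Generalized Advantage Estimation for a single trajectory. It is written here as a standalone example under assumed tensor shapes; it is not the implementation used in this repository, and the function name and arguments are placeholders.

    import torch

    def compute_gae(rewards, values, last_value, gamma=0.99, lam=0.95):
        # rewards: tensor of shape [T]; values: tensor of shape [T] holding V(s_t);
        # last_value: V(s_T) used to bootstrap the final step.
        # Termination handling (done flags) is omitted for brevity.
        T = rewards.shape[0]
        advantages = torch.zeros(T)
        gae = 0.0
        next_value = last_value
        for t in reversed(range(T)):
            # TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
            delta = rewards[t] + gamma * next_value - values[t]
            # GAE recursion: A_t = delta_t + gamma * lambda * A_{t+1}
            gae = delta + gamma * lam * gae
            advantages[t] = gae
            next_value = values[t]
        return advantages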

In this repository, OpenAI Gym environments such as CartPole-v0, Pendulum-v0, and BipedalWalker-v3 are used. You need to install them before running the code in this repository.

Note: The environment names may differ depending on your version of OpenAI Gym.
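
If you have not used these environments before, the following minimal sketch shows how one of them can be created and stepped with random actions just to verify the installation. It assumes the classic (pre-0.26) Gym API, in which reset() returns an observation and step() returns four values; newer Gym or Gymnasium releases change both the API and some environment names.

    import gym

    # Create one of the environments used in this repository.
    env = gym.make("CartPole-v0")

    obs = env.reset()
    done = False
    total_reward = 0.0
    while not done:
        # Sample a random action just to check that the environment runs.
        action = env.action_space.sample()
        obs, reward, done, info = env.step(action)
        total_reward += reward
    print("episode return:", total_reward)
    env.close()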

Install Dependencies

  1. Install Python 3.

  2. Install the Python packages in requirements.txt. If you are using a virtual environment for Python package management, you can install all the required packages with the following bash command:

    $ pip install -r requirements.txt
  3. Install any other packages needed to run the OpenAI Gym environments. These depend on your machine's development environment.

  4. Install PyTorch. The PyTorch version should be greater than or equal to 1.7.0. A quick version check such as the sketch below can confirm this.
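
As a quick sanity check after installation, a short snippet like the one below can confirm that PyTorch and Gym import correctly and that the PyTorch version satisfies the requirement; treat it only as an illustrative check.

    import gym
    import torch

    print("PyTorch:", torch.__version__)
    print("Gym:", gym.__version__)

    # This repository requires PyTorch >= 1.7.0.
    major, minor = (int(x) for x in torch.__version__.split(".")[:2])
    assert (major, minor) >= (1, 7), "PyTorch 1.7.0 or later is required."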

Training and Running

  1. Modify config.json to match your machine's settings. A sketch of how config.json and the command-line options below typically fit together appears after this list.

  2. Execute the training process with train.py. An example usage of train.py is as follows:

    $ python train.py --model_name=trpo --env_name=BipedalWalker-v3

    The following bash command shows the available options:

    $ python train.py -h
  3. You can run your pre-trained agents by executing run.py. Its usage is similar to that of train.py. You can also check the help message with the following bash command:

    $ python run.py -h
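
The sketch below illustrates how an entry point such as train.py typically wires config.json together with the command-line options shown above; the config keys, defaults, and structure here are hypothetical placeholders and may not match the actual files in this repository.

    import argparse
    import json

    # Load machine-specific settings; the real keys in config.json may differ.
    with open("config.json") as f:
        config = json.load(f)

    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name", type=str, default="trpo",
                        help="algorithm to train, e.g. trpo or ppo")
    parser.add_argument("--env_name", type=str, default="BipedalWalker-v3",
                        help="OpenAI Gym environment name")
    args = parser.parse_args()

    print("Training", args.model_name, "on", args.env_name, "with config:", config)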

The results of the CartPole environment

The results of the Pendulum environment

The results of the BipedalWalker environment

Recent Works

  • CUDA support is now available.
  • Fixed some errors in GAE and PPO.
  • Fixed some errors related to the horizon.

Future Works

  • Find and fix the errors in the Actor-Critic implementation
  • Implement ACER
  • Explore other environments for running the algorithms

References

  • An explanation of the TRPO line search: link
  • An additional stability method for the PPO value function: link
