generalized-advantage-estimation

An implementation of Proximal Policy Optimization (PPO), a state-of-the-art family of reinforcement learning algorithms, using normalized Generalized Advantage Estimation (GAE) and optional batch-mode training. The loss function incorporates an entropy bonus.

python machine-learning reinforcement-learning entropy deep-learning neural-network optimization gae pytorch rl actor-critic proximal-policy-optimization ppo open-ai open-ai-gym generalized-advantage-estimation ppo-pytorch

Updated Dec 26, 2022
Python
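
The entry above mentions normalized Generalized Advantage Estimation. As a rough illustration of what that term usually refers to, here is a minimal pure-Python sketch. It is not taken from any repository listed on this page; the function names `compute_gae` and `normalize`, the argument layout, and the default values of `gamma` and `lam` are assumptions made for the example.

```python
from statistics import mean, pstdev


def compute_gae(rewards, values, dones, gamma=0.99, lam=0.95):
    """Return GAE advantages for one rollout (hypothetical helper).

    `values` must hold len(rewards) + 1 entries: the last one is the
    bootstrap value of the state reached after the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] * not_done - values[t]
        # Exponentially weighted sum of TD errors, reset at episode ends
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
    return advantages


def normalize(xs, eps=1e-8):
    """Shift to zero mean and scale by the population standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / (sigma + eps) for x in xs]
```

With `lam=0` the advantages reduce to one-step TD errors, and with `gamma=1, lam=1` and a zero value function they reduce to the reward-to-go. Normalizing the advantages per batch is a common stabilizing choice in PPO implementations, though individual repositories may do it differently.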

nslyubaykin/rnns_for_pomdp

Star 2

Recurrent Policies for Handling Partially Observable Environments

reinforcement-learning gae lstm policy-gradient pomdp proximal-policy-optimization ppo reccurent-neural-network partially-observable-environment generalized-advantage-estimation

Updated Aug 29, 2022
Jupyter Notebook

nslyubaykin/relax_trpo_example

Star 0

Example TRPO implementation with ReLAx

reinforcement-learning gae policy-gradient reinforcement-learning-algorithms continuous-control trpo generalized-advantage-estimation discrete-control

Updated Aug 29, 2022
Jupyter Notebook

nslyubaykin/relax_ppo_example

Star 0

Example PPO implementation with ReLAx

reinforcement-learning gae policy-gradient reinforcement-learning-algorithms continuous-control proximal-policy-optimization ppo generalized-advantage-estimation discrete-control

Updated Aug 29, 2022
Jupyter Notebook

Here are 8 public repositories matching this topic...

bentrevett/pytorch-rl

adik993/ppo-pytorch

hcnoh/rl-collection-pytorch

leaderj1001/Phasic-Policy-Gradient

tomasspangelo/proximal-policy-optimization

nslyubaykin/rnns_for_pomdp

nslyubaykin/relax_trpo_example

nslyubaykin/relax_ppo_example
