RLOpensource/IMPALA-Distributed-Tensorflow


Information

  • These results are from only 32 threads.
  • A total of 32 CPUs were used, 4 environments were configured per game type, and a total of 8 games were trained.
  • TensorFlow implementation
  • Uses a DQN model for action inference
  • Uses distributed TensorFlow to implement the actors
  • Trained for 1 day
  • Same hyperparameters as the paper:
    • start learning rate = 0.0006
    • end learning rate = 0
    • learning frame = 1e6
    • gradient clip norm = 40
    • trajectory = 20
    • batch size = 32
    • reward clipping = -1 ~ 1
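For concreteness, a minimal TensorFlow 1.x sketch of how these settings are typically wired together: a linear learning-rate decay from 0.0006 to 0 over 1e6 learning frames, plus gradient clipping by global norm 40. The variable `w` and `total_loss` are dummies so the snippet is self-contained; they are not the repository's actual model or loss.

```python
import tensorflow as tf  # written against tensorflow==1.14.0

# Dummy parameter and loss; in the repo the loss would be the IMPALA
# (V-trace) policy/value/entropy loss.
w = tf.get_variable("w", shape=[4], initializer=tf.zeros_initializer())
total_loss = tf.reduce_sum(tf.square(w - 1.0))

global_step = tf.train.get_or_create_global_step()

# Linear decay from the start learning rate (6e-4) to 0 over 1e6 learning frames.
learning_rate = tf.train.polynomial_decay(
    learning_rate=0.0006,
    global_step=global_step,
    decay_steps=int(1e6),
    end_learning_rate=0.0)

optimizer = tf.train.RMSPropOptimizer(learning_rate)

# Clip gradients by global norm 40 before applying them.
grads_and_vars = optimizer.compute_gradients(total_loss)
grads, variables = zip(*grads_and_vars)
clipped_grads, _ = tf.clip_by_global_norm(grads, 40.0)
train_op = optimizer.apply_gradients(zip(clipped_grads, variables),
                                     global_step=global_step)
```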

Dependency

  • tensorflow==1.14.0
  • gym[atari]
  • numpy
  • tensorboardX
  • opencv-python

Overall Schema

Model Architecture

How to Run

  • See start.sh.
  • Trains 8 types of games at a time; each game uses 4 environments (see the sketch below).
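The actors are implemented with distributed TensorFlow (see Information above). As an illustration only, a learner/actor cluster in TF 1.x is usually wired up along the following lines; the actual job names, host list, and ports are defined by start.sh and the training script, which this sketch does not reproduce.

```python
import tensorflow as tf  # tensorflow==1.14.0

# Hypothetical cluster layout: 1 learner plus N actor workers.
NUM_ACTORS = 4  # e.g. 4 environments for one game type

cluster = tf.train.ClusterSpec({
    "learner": ["localhost:2222"],
    "actor": ["localhost:%d" % (2223 + i) for i in range(NUM_ACTORS)],
})

# Each process would normally receive its own job_name/task_index via flags,
# e.g. --job_name=learner --task=0 or --job_name=actor --task=0..3.
job_name, task_index = "actor", 0

server = tf.train.Server(cluster, job_name=job_name, task_index=task_index)

if job_name == "learner":
    # The learner owns the global parameters and runs the training updates.
    with tf.device("/job:learner/task:0"):
        pass  # build the global network and optimizer here
    server.join()
else:
    # Each actor builds a local copy of the policy, steps its environments,
    # and periodically copies parameters from the learner's variables.
    with tf.device("/job:learner/task:0"):
        pass  # reference the shared/global variables here
```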

Result

Video

(Gameplay videos for Breakout, Pong, Seaquest, Space-Invader, Boxing, Star-Gunner, Kung-Fu, and Demon.)

Plotting

(Training reward plots, reward clipping = abs_one.)

Compare reward clipping method
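The two schemes compared here are abs_one (hard clip to [-1, 1]) and soft_asymmetric. Below is a sketch of both, following the definitions used in deepmind/scalable_agent (reference 2); whether this repository uses exactly the same tanh scale and 0.3 down-weighting constant is an assumption.

```python
import tensorflow as tf  # tensorflow==1.14.0

def clip_rewards(rewards, mode="abs_one"):
    """Reward clipping variants compared in this experiment.

    `abs_one` is the standard hard clip to [-1, 1]. `soft_asymmetric` follows
    the form in deepmind/scalable_agent: rewards are squashed with tanh and
    negative rewards are down-weighted (constants 0.3 and 5.0 are assumed).
    """
    if mode == "abs_one":
        return tf.clip_by_value(rewards, -1.0, 1.0)
    elif mode == "soft_asymmetric":
        squeezed = tf.tanh(rewards / 5.0)
        # Negative rewards get less weight than positive rewards.
        return tf.where(rewards < 0.0, 0.3 * squeezed, squeezed) * 5.0
    return rewards

# Example: compare the two schemes on a few raw Atari rewards.
raw = tf.constant([-10.0, -1.0, 0.0, 1.0, 10.0])
with tf.Session() as sess:
    print(sess.run(clip_rewards(raw, "abs_one")))
    print(sess.run(clip_rewards(raw, "soft_asymmetric")))
```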

Video

(Pong gameplay videos: abs_one vs. soft_asymmetric reward clipping.)

Plotting

(Training curves comparing abs_one and soft_asymmetric reward clipping.)

Is Attention Really Working?

(Attention-map visualization overlaid on the game screen.)
  • The blocks at the top of the screen are ignored.
  • The ball and the paddle receive attention.
  • Some empty space also receives attention, because the model is not yet fully trained (see the sketch below).
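How the model computes and exposes its attention weights is defined by the repository's model code; as a generic sketch only, a spatial attention map over convolutional features can be rendered on top of the observation as follows. The function name, the 7x7 map size, and the blending constants are illustrative assumptions, not the repository's implementation.

```python
import numpy as np
import cv2  # opencv-python, already a listed dependency

def attention_overlay(frame, attention_logits):
    """Overlay a spatial attention map on an observation frame.

    frame:            (H, W, 3) uint8 game screen.
    attention_logits: (h, w) unnormalized attention scores, e.g. computed over
                      the final conv feature map (assumed shape here).
    """
    # Softmax over all spatial positions -> probabilities that sum to 1.
    weights = np.exp(attention_logits - attention_logits.max())
    weights /= weights.sum()

    # Upscale the low-resolution map to the frame size and normalize to 0-255.
    heatmap = cv2.resize(weights, (frame.shape[1], frame.shape[0]))
    heatmap = (heatmap / (heatmap.max() + 1e-8) * 255).astype(np.uint8)
    heatmap = cv2.applyColorMap(heatmap, cv2.COLORMAP_JET)

    # Blend heatmap and frame so attended regions (ball, paddle) stand out.
    return cv2.addWeighted(frame, 0.6, heatmap, 0.4, 0)

# Usage: frame from the Atari env, logits fetched from the model in a session.
frame = np.zeros((210, 160, 3), dtype=np.uint8)    # dummy Breakout-sized frame
logits = np.random.randn(7, 7).astype(np.float32)  # dummy 7x7 attention scores
overlay = attention_overlay(frame, logits)
```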

Todo

  • CPU-only training method
  • Distributed TensorFlow
  • Model fix for preventing collapse
  • Reward Clipping Experiment
  • Parameter copying from global learner
  • Add Relational Reinforcement Learning
  • Add Action information to Model
  • Multi Task Learning
  • Add Recurrent Model
  • Training on GPU, Inference on CPU

Reference

  1. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
  2. deepmind/scalable_agent
  3. Asynchronous_Advatnage_Actor_Critic

