devendrachaplot/DeepRL-Grounding


Train an RL agent to execute natural language instructions in a 3D Environment (PyTorch)

sites.google.com/view/gated-attention/home

License

MIT license


Gated-Attention Architectures for Task-Oriented Language Grounding

This is a PyTorch implementation of the AAAI-18 paper:

Gated-Attention Architectures for Task-Oriented Language Grounding
Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
Carnegie Mellon University

Project Website: https://sites.google.com/view/gated-attention

This repository contains:

Code for training an A3C-LSTM agent using Gated-Attention
Code for Doom-based language grounding environment

Dependencies

(We recommend using Anaconda)

Usage

Using the Environment

For running a random agent:

python env_test.py
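As a rough illustration of what a random agent does, the loop below samples a uniformly random action at every step until the episode ends. It is a minimal sketch, not the code in env_test.py: `StubGroundingEnv`, its `reset`/`step` interface, the three-action space, and the sample instruction are all hypothetical stand-ins, and the real environment in env.py may expose a different API. Only the episode limit of 30 steps and the per-step reward of 0 are taken from the defaults documented for a3c_main.py.

```python
import random

NUM_ACTIONS = 3  # assumption: a small discrete action set (e.g. turn left, turn right, move forward)


class StubGroundingEnv:
    """Hypothetical stand-in for the Doom environment; NOT the class in env.py."""

    def __init__(self, max_episode_length=30, living_reward=0.0):
        self.max_episode_length = max_episode_length
        self.living_reward = living_reward
        self.steps = 0

    def _observation(self):
        # A real environment would return a rendered frame with the instruction.
        return {"frame": None, "instruction": "Go to the green torch"}  # illustrative text

    def reset(self):
        self.steps = 0
        return self._observation()

    def step(self, action):
        if not 0 <= action < NUM_ACTIONS:
            raise ValueError("invalid action: %r" % (action,))
        self.steps += 1
        # Stub dynamics: nothing is ever reached, so the episode times out.
        done = self.steps >= self.max_episode_length
        return self._observation(), self.living_reward, done


def run_random_episode(env, rng):
    """Play one episode with uniformly random actions; return (steps, total reward)."""
    env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = rng.randrange(NUM_ACTIONS)
        _, reward, done = env.step(action)
        total_reward += reward
        steps += 1
    return steps, total_reward


if __name__ == "__main__":
    steps, total = run_random_episode(StubGroundingEnv(), random.Random(0))
    print("episode length:", steps, "return:", total)
```

Because the stub never produces a success or failure event, every episode runs to the step limit. In the real environment an episode would presumably also end when the agent reaches an object.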

To play in the environment:

python env_test.py --interactive 1

To change the difficulty of the environment (easy/medium/hard):

python env_test.py -d easy

Training Gated-Attention A3C-LSTM agent

For training an A3C-LSTM agent with 32 training processes:

python a3c_main.py --num-processes 32 --evaluate 0

The code will save the best model at ./saved/model_best.

To test the pre-trained model for Multitask Generalization:

python a3c_main.py --evaluate 1 --load saved/pretrained_model

To test the pre-trained model for Zero-shot Task Generalization:

python a3c_main.py --evaluate 2 --load saved/pretrained_model

To visualize the model while testing, add --visualize 1:

python a3c_main.py --evaluate 2 --load saved/pretrained_model --visualize 1

To test the trained model, use --load saved/model_best in the above commands.
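The commands above differ only in a few flags, which the following sketch makes explicit by assembling the argument list for each mode. `build_command` is a hypothetical helper written for illustration and is not part of the repository. The flag names and the meaning of the `--evaluate` values are taken from the commands and help text in this README.

```python
# Meaning of the --evaluate values, as listed in the a3c_main.py help text.
EVAL_MODES = {
    0: "train",
    1: "evaluate multitask generalization",
    2: "evaluate zero-shot task generalization",
}


def build_command(evaluate, load=None, visualize=False, num_processes=None):
    """Return the argv list for a3c_main.py (hypothetical helper)."""
    if evaluate not in EVAL_MODES:
        raise ValueError("evaluate must be one of %s" % sorted(EVAL_MODES))
    argv = ["python", "a3c_main.py"]
    if num_processes is not None:
        argv += ["--num-processes", str(num_processes)]
    argv += ["--evaluate", str(evaluate)]
    if load is not None:
        argv += ["--load", load]
    if visualize:
        argv += ["--visualize", "1"]
    return argv


if __name__ == "__main__":
    # Training run with 32 processes.
    print(" ".join(build_command(0, num_processes=32)))
    # Zero-shot evaluation of a model trained locally, with visualization.
    print(" ".join(build_command(2, load="saved/model_best", visualize=True)))
```

The returned list can be passed directly to `subprocess.run` if the commands need to be launched from a script rather than a shell.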

All arguments for a3c_main.py:

  -h, --help            show this help message and exit
  -l MAX_EPISODE_LENGTH, --max-episode-length MAX_EPISODE_LENGTH
                        maximum length of an episode (default: 30)
  -d DIFFICULTY, --difficulty DIFFICULTY
                        Difficulty of the environment, "easy", "medium" or
                        "hard" (default: hard)
  --living-reward LIVING_REWARD
                        Default reward at each time step (default: 0, change
                        to -0.005 to encourage shorter paths)
  --frame-width FRAME_WIDTH
                        Frame width (default: 300)
  --frame-height FRAME_HEIGHT
                        Frame height (default: 168)
  -v VISUALIZE, --visualize VISUALIZE
                        Visualize the envrionment (default: 0, use 0 for
                        faster training)
  --sleep SLEEP         Sleep between frames for better visualization
                        (default: 0)
  --scenario-path SCENARIO_PATH
                        Doom scenario file to load (default: maps/room.wad)
  --interactive INTERACTIVE
                        Interactive mode enables human to play (default: 0)
  --all-instr-file ALL_INSTR_FILE
                        All instructions file (default:
                        data/instructions_all.json)
  --train-instr-file TRAIN_INSTR_FILE
                        Train instructions file (default:
                        data/instructions_train.json)
  --test-instr-file TEST_INSTR_FILE
                        Test instructions file (default:
                        data/instructions_test.json)
  --object-size-file OBJECT_SIZE_FILE
                        Object size file (default: data/object_sizes.txt)
  --lr LR               learning rate (default: 0.001)
  --gamma G             discount factor for rewards (default: 0.99)
  --tau T               parameter for GAE (default: 1.00)
  --seed S              random seed (default: 1)
  -n N, --num-processes N
                        how many training processes to use (default: 4)
  --num-steps NS        number of forward steps in A3C (default: 20)
  --load LOAD           model path to load, 0 to not reload (default: 0)
  -e EVALUATE, --evaluate EVALUATE
                        0:Train, 1:Evaluate MultiTask Generalization
                        2:Evaluate Zero-shot Generalization (default: 0)
  --dump-location DUMP_LOCATION
                        path to dump models and log (default: ./saved/)
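The help text says a living reward of -0.005 encourages shorter paths. The sketch below shows why by comparing the return of a short and a long successful episode under different `--gamma` and `--living-reward` settings. It rests on assumptions not stated in this README: a terminal reward of 1.0 for reaching the goal, and the living reward applying only to non-terminal steps. The function names are hypothetical and this is not the training code in a3c_train.py.

```python
def episode_rewards(steps_to_goal, living_reward=0.0, goal_reward=1.0):
    """Per-step rewards for an episode that succeeds on its last step.

    Assumption: the living reward applies to every step before the final one,
    and goal_reward=1.0 is an illustrative value, not taken from the README.
    """
    if steps_to_goal < 1:
        raise ValueError("steps_to_goal must be at least 1")
    return [living_reward] * (steps_to_goal - 1) + [goal_reward]


def discounted_return(rewards, gamma=0.99):
    """Sum of gamma**t * r_t over the episode, computed backwards."""
    ret = 0.0
    for reward in reversed(rewards):
        ret = reward + gamma * ret
    return ret


if __name__ == "__main__":
    for gamma in (1.0, 0.99):
        for living in (0.0, -0.005):
            short = discounted_return(episode_rewards(10, living), gamma)
            long_ = discounted_return(episode_rewards(25, living), gamma)
            print("gamma=%.2f living=%+.3f  10 steps: %.4f  25 steps: %.4f"
                  % (gamma, living, short, long_))
```

Without discounting (gamma = 1.0) and with a living reward of 0, both episodes have the same return, so nothing favors the shorter path. A negative living reward or a discount factor below 1 each makes the shorter episode score higher.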

Demonstration videos:

Multitask Generalization video: https://www.youtube.com/watch?v=YJG8fwkv7gA

Zero-shot Task Generalization video: https://www.youtube.com/watch?v=JziCKsLrudE

Different stages of training: https://www.youtube.com/watch?v=o_G6was03N0

Cite as

Chaplot, D.S., Sathyendra, K.M., Pasumarthi, R.K., Rajagopal, D. and Salakhutdinov, R., 2017. Gated-Attention Architectures for Task-Oriented Language Grounding. arXiv preprint arXiv:1706.07230. (PDF)

Bibtex:

@article{chaplot2017gated,
  title={Gated-Attention Architectures for Task-Oriented Language Grounding},
  author={Chaplot, Devendra Singh and Sathyendra, Kanthashree Mysore and Pasumarthi, Rama Kumar and Rajagopal, Dheeraj and Salakhutdinov, Ruslan},
  journal={arXiv preprint arXiv:1706.07230},
  year={2017}
}

Acknowledgements

This repository uses the ViZDoom API (https://github.com/mwydmuch/ViZDoom) and parts of the code from the API. The implementation of A3C is borrowed from https://github.com/ikostrikov/pytorch-a3c. The poisson-disc code is borrowed from https://github.com/IHautaI/poisson-disc.

