This repository was archived by the owner on Jan 16, 2023. It is now read-only.

google-research/seed_rl (public archive)

SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture.

License

Apache-2.0 license

Folders and files

agents
atari
common
dmlab
docker
docs
football
gcp
grpc
mujoco
tests
AUTHORS
LICENSE
README.md
__init__.py
run_local.sh
stop_local.sh

SEED (archived)

This repository contains an implementation of a distributed reinforcement learning agent where both training and inference are performed on the learner.

The project is a research project and has now been archived. There will be no further updates.

Four agents are implemented:

The code is already interfaced with the following environments:

However, any reinforcement learning environment using the gym API can be used.

For a detailed description of the architecture please read our paper. Please cite the paper if you use the code from this repository in your work.

Bibtex

@article{espeholt2019seed,
    title={SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference},
    author={Lasse Espeholt and Rapha{\"e}l Marinier and Piotr Stanczyk and Ke Wang and Marcin Michalski},
    year={2019},
    eprint={1910.06591},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Pull Requests

At this time, we do not accept pull requests. We are happy to link to forks that add interesting functionality.

Prerequisites

There are a few steps you need to take before playing with SEED. The instructions below assume you run the Ubuntu distribution.

Install Docker by following the instructions at https://docs.docker.com/install/linux/docker-ce/ubuntu/. You need version 19.03 or later, which is required for GPU support.
Make sure Docker works as a non-root user by following the instructions at https://docs.docker.com/install/linux/linux-postinstall, section "Manage Docker as a non-root user".
Install git:

apt-get install git

Clone the SEED git repository:

git clone https://github.com/google-research/seed_rl.git
cd seed_rl
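
The Docker version requirement above can be checked from the shell. The following is a minimal sketch, not part of the SEED repository: the helper names `version_ge` and `check_docker_version` are hypothetical, and the comparison assumes GNU `sort -V`, which ships with Ubuntu.

```shell
#!/bin/sh
# version_ge A B: succeeds when dotted version A is >= version B.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_docker_version: hypothetical helper that compares the installed
# Docker server version against the 19.03 minimum stated above.
# Returns 2 if the version cannot be read (Docker missing or not running).
check_docker_version() {
  installed="$(docker version --format '{{.Server.Version}}' 2>/dev/null)" || return 2
  version_ge "$installed" "19.03"
}

# Usage: check_docker_version && echo "Docker is recent enough"
```

If `check_docker_version` returns 2, Docker is either not installed or the current user cannot reach the daemon, which points back to the non-root setup step.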

Local Machine Training on a Single Level

To make it easy to get started with SEED, we provide a way of running it on a local machine. You just need to run one of the following commands (adjusting the number of actors and the number of environments per actor, i.e. the environment batch size, to your machine):

./run_local.sh [Game] [Agent] [number of actors] [number of envs. per actor]
./run_local.sh atari r2d2 4 4
./run_local.sh football vtrace 4 1
./run_local.sh dmlab vtrace 4 4
./run_local.sh mujoco ppo 4 32 --gin_config=/seed_rl/mujoco/gin/ppo.gin

It will build a Docker image using the SEED source code and start the training inside the Docker image. Note that hyperparameters are not tuned in the runs above. TensorBoard is started as part of the training. It can be viewed at http://localhost:6006 by default.

We also provide a sample script for running training with tuned parameters for HalfCheetah-v2. This setup runs training with 8x32=256 parallel environments to make training faster. The sample complexity can be improved at the cost of slower training by running fewer environments and increasing the unroll_length parameter.

./mujoco/local_baseline_HalfCheetah-v2.sh
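
The environment count quoted above is the number of actors multiplied by the number of environments per actor. As a small illustration (the function name `total_envs` is hypothetical and not part of the repository), the same arithmetic can be done in the shell when choosing arguments for your machine:

```shell
#!/bin/sh
# total_envs ACTORS ENVS_PER_ACTOR: prints the number of parallel environments.
total_envs() {
  echo $(( $1 * $2 ))
}

total_envs 8 32   # prints 256, matching the HalfCheetah-v2 setup described above
```

By the same rule, `./run_local.sh atari r2d2 4 4` runs 16 environments in parallel.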

Distributed Training using AI Platform

Note that training with AI Platform results in charges for using compute resources.

The first step is to configure GCP and a Cloud project you will use for training:

Install the Cloud SDK following the instructions at https://cloud.google.com/sdk/install and set up your GCP project.
Make sure that billing is enabled for your project.
Enable the AI Platform ("Cloud Machine Learning Engine") and Compute Engine APIs.
Grant access to the AI Platform service accounts as described at https://cloud.google.com/ml-engine/docs/working-with-cloud-storage.
Cloud-authenticate in your shell, so that SEED scripts can use your project:

gcloud auth login
gcloud config set project [YOUR_PROJECT]
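
Before launching a paid job, it may be worth confirming which project the shell is pointed at. The sketch below is an illustration only: `active_project` is a hypothetical helper, and it relies on `gcloud config get-value project`, which prints the currently configured project.

```shell
#!/bin/sh
# active_project: prints the project currently configured in gcloud.
# Returns non-zero when gcloud is not on the PATH or no project is set.
active_project() {
  command -v gcloud >/dev/null 2>&1 || { echo "gcloud not found" >&2; return 1; }
  project="$(gcloud config get-value project 2>/dev/null)"
  # Treat an empty value, or a literal "(unset)" in case it appears
  # on stdout, as "no project configured".
  [ -n "$project" ] && [ "$project" != "(unset)" ] || return 1
  printf '%s\n' "$project"
}

# Usage: active_project || echo "Run: gcloud config set project [YOUR_PROJECT]"
```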

Then you just need to execute one of the provided scenarios:

gcp/train_[scenario_name].sh

This will build the Docker image, push it to the repository which AI Platform can access, and start the training process on the Cloud. Follow the output of the command for progress. You can also view the running training jobs at https://console.cloud.google.com/ml/jobs

DeepMind Lab Level Cache

By default, the majority of DeepMind Lab's CPU usage is generated by creating new scenarios. This cost can be eliminated by enabling the level cache. To enable it, set the level_cache_dir flag in dmlab/config.py. As there are many unique episodes, it is a good idea to share the same cache across multiple experiments. For AI Platform, you can add --level_cache_dir=gs://${BUCKET_NAME}/dmlab_cache to the list of parameters passed in gcp/submit.sh to the experiment.
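
As a small illustration of the flag described above, the helper below builds the `--level_cache_dir` value for a given bucket. The function name `level_cache_flag` and the bucket name in the usage comment are hypothetical; how the flag is wired into gcp/submit.sh is not shown here and should be checked against that script.

```shell
#!/bin/sh
# level_cache_flag BUCKET: prints the level cache flag for a GCS bucket,
# using the gs://${BUCKET_NAME}/dmlab_cache layout from the text above.
level_cache_flag() {
  printf '%s\n' "--level_cache_dir=gs://$1/dmlab_cache"
}

# Usage (hypothetical bucket name):
#   level_cache_flag my-seed-bucket
```

Pointing several experiments at the same bucket path lets them share one cache, as recommended above.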

Baseline data on ATARI-57

We provide baseline training data for SEED's R2D2 trained on ATARI games in the form of training curves (checkpoints and Tensorboard event files coming soon). We provide data for 4 independent seeds run up to 40e9 environment frames.

The hyperparameters and evaluation procedure are the same as in section A.3.1 in the paper.

Training curves

Training curves are available on this page.

Checkpoints and Tensorboard event files

Checkpoints and Tensorboard event files can be downloaded individually here or as a single (70 GB) zip file.

Additional links

SEED was used as a core infrastructure piece for the "What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study" paper. A colab that reproduces plots from the paper can be found here.
