Skill-based Model-based Reinforcement Learning (CoRL 2022)
[Project website] [Paper] [arXiv]
This project is a PyTorch implementation of Skill-based Model-based Reinforcement Learning, published in CoRL 2022.
Files and directories:
- `run.py`: launches an appropriate trainer based on the algorithm
- `skill_trainer.py`: trainer for skill-based approaches
- `skimo_agent.py`: model and training code for SkiMo
- `skimo_rollout.py`: rollout with the SkiMo agent
- `spirl_tdmpc_agent.py`: model and training code for SPiRL+TD-MPC
- `spirl_tdmpc_rollout.py`: rollout with SPiRL+TD-MPC
- `spirl_dreamer_agent.py`: model and training code for SPiRL+Dreamer
- `spirl_dreamer_rollout.py`: rollout with SPiRL+Dreamer
- `spirl_trainer.py`: trainer for SPiRL
- `spirl_agent.py`: model for SPiRL
- `config/`: default hyperparameters
- `calvin/`: CALVIN environments
- `d4rl/`: D4RL environments forked by Karl Pertsch; our only change is in the installation command
- `envs/`: environment wrappers
- `spirl/`: SPiRL code
- `data/`: offline data directory
- `rolf/`: implementation of RL algorithms from robot-learning by Youngwoon Lee
- `log/`: training logs, evaluation results, and checkpoints
Prerequisites:
- Ubuntu 20.04
- Python 3.9
- MuJoCo 2.1
Installation:
- Clone this repository.
git clone --recursive git@github.com:clvrai/skimo.git
cd skimo
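If the repository was cloned without `--recursive`, directories that come from sub-repositories (e.g. `calvin/`, `d4rl/`, `spirl/`) may be empty. Assuming these are tracked as git submodules, they can be fetched afterwards:

```bash
# Fetch submodules after a non-recursive clone
# (assumes calvin/, d4rl/, and spirl/ are git submodules)
git submodule update --init --recursive
```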
- Create a virtual environment
conda create -n skimo_venv python=3.9
conda activate skimo_venv
- Install MuJoCo 2.1
  - Download the MuJoCo version 2.1 binaries for Linux or OSX.
  - Extract the downloaded `mujoco210` directory into `~/.mujoco/mujoco210`.
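The original instructions stop here, but on Linux `mujoco-py` (the usual Python binding for MuJoCo 2.1) typically also needs the MuJoCo binaries on the dynamic-linker path. The line below reflects a common setup and is an assumption, not a step from this README:

```bash
# Make the MuJoCo 2.1 binaries visible to mujoco-py (common requirement on Linux)
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco210/bin
```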
- Install packages
sh install.sh
# Navigate to the data directory
mkdir data && cd data

# Maze
gdown 1GWo8Vr8Xqj7CfJs7TaDsUA6ELno4grKJ

# Kitchen (and mis-aligned kitchen)
gdown 1Fym9prOt5Cu_I73F20cdd3lXZPhrvEsd

# CALVIN
gdown 1g4ONf_3cNQtrZAo2uFa_t5MOopSr2DNY

cd ..
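The commands above rely on `gdown`, a pip package for downloading files from Google Drive. Whether `install.sh` already installs it is an assumption; if the command is not found, it can be installed manually:

```bash
# Install gdown if it is not already available in the current environment
pip install gdown
```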
Commands for SkiMo and all baselines are listed below. Results will be logged to WandB. Before running them, please change the wandb entity in run.py#L36 to match your account.
Please replace `[ENV]` with one of `maze`, `kitchen`, `calvin`. For the mis-aligned kitchen task, append `env.task=misaligned` to the downstream RL command. After pre-training, set `[PRETRAINED_CKPT]` to the path of the pre-trained checkpoint.
SkiMo (ours)
- Pre-training
python run.py --config-name skimo_[ENV] run_prefix=test gpu=0 wandb=true
You can also skip this step by downloading our pre-trained model checkpoints. See instructions in pretrained_models.md.
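For example, with `[ENV]` set to `kitchen`, the pre-training command becomes:

```bash
# Pre-train SkiMo on the kitchen environment
python run.py --config-name skimo_kitchen run_prefix=test gpu=0 wandb=true
```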
- Downstream RL
python run.py --config-name skimo_[ENV] run_prefix=test gpu=0 wandb=true rolf.phase=rl rolf.pretrain_ckpt_path=[PRETRAINED_CKPT]
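As a concrete case, downstream RL on the mis-aligned kitchen task appends the `env.task=misaligned` override to the command above (`[PRETRAINED_CKPT]` still needs to be replaced with the actual checkpoint path):

```bash
# Downstream RL for SkiMo on the mis-aligned kitchen task
python run.py --config-name skimo_kitchen run_prefix=test gpu=0 wandb=true rolf.phase=rl rolf.pretrain_ckpt_path=[PRETRAINED_CKPT] env.task=misaligned
```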
Dreamer
python run.py --config-name dreamer_config env=[ENV] run_prefix=test gpu=0 wandb=true
TD-MPC
python run.py --config-name tdmpc_config env=[ENV] run_prefix=test gpu=0 wandb=true
SPiRL
- Need to first pre-train or download the skill prior (see instructions here).
- Downstream RL
python run.py --config-name spirl_config env=[ENV] run_prefix=test gpu=0 wandb=true
SPiRL+Dreamer
- Downstream RL
python run.py --config-name spirl_dreamer_[ENV] run_prefix=test gpu=0 wandb=true
SPiRL+TD-MPC
- Downstream RL
python run.py --config-name spirl_tdmpc_[ENV] run_prefix=test gpu=0 wandb=true
SkiMo+SAC
- Downstream RL
python run.py --config-name skimo_[ENV] run_prefix=sac gpu=0 wandb=true rolf.phase=rl rolf.use_cem=false rolf.n_skill=1 rolf.prior_reg_critic=true rolf.sac=true rolf.pretrain_ckpt_path=[PRETRAINED_CKPT]
SkiMo without joint training
- Pre-training
python run.py --config-name skimo_[ENV] run_prefix=no_joint gpu=0 wandb=true rolf.joint_training=false
- Downstream RL
python run.py --config-name skimo_[ENV] run_prefix=no_joint gpu=0 wandb=true rolf.joint_training=false rolf.phase=rl rolf.pretrain_ckpt_path=[PRETRAINED_CKPT]
If `mpi4py` fails to install, the solution is to install it with conda instead, which requires a lower version of Python:
conda install python==3.8
conda install mpi4py
Now you can re-run `sh install.sh`.
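A quick sanity check that the conda-installed `mpi4py` is picked up by the environment (this check is a suggestion, not part of the original instructions):

```bash
# Confirm mpi4py imports correctly in the active environment
python -c "import mpi4py; print(mpi4py.__version__)"
```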
See this. In my case, I needed to change `/usr/local/` to `/opt/homebrew/` in all paths.
If you find our code useful for your research, please cite:
@inproceedings{shi2022skimo,
  title={Skill-based Model-based Reinforcement Learning},
  author={Lucy Xiaoyang Shi and Joseph J. Lim and Youngwoon Lee},
  booktitle={Conference on Robot Learning},
  year={2022}
}
- This code is based on Youngwoon's robot-learning repo: https://github.com/youngwoon/robot-learning
- SPiRL: https://github.com/clvrai/spirl
- TD-MPC: https://github.com/nicklashansen/tdmpc
- Dreamer: https://github.com/danijar/dreamer
- D4RL: https://github.com/rail-berkeley/d4rl
- CALVIN: https://github.com/mees/calvin