# DES-Lab/Q-learning-under-Partial-Observability
Q^A-learning is a method for finding a policy in partially observable environments. It uses IOAlergia to approximate the underlying POMDP and uses the learned model to extend the state space.

This repository contains the implementation of Q^A-learning and the experiments found in the paper.
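At its core, the agent tracks the current state of the IOAlergia-learned MDP alongside the raw observation and performs tabular Q-learning over the combined pair. Below is a minimal sketch of this idea; all names are illustrative and not the repository's API:

```python
import random
from collections import defaultdict

# Q-values over the extended state space: (learned-model state, observation) pairs.
Q = defaultdict(lambda: defaultdict(float))

def select_action(model_state, obs, actions, epsilon=0.1):
    """Epsilon-greedy action selection over the extended state."""
    if random.random() < epsilon:
        return random.choice(actions)
    s = (model_state, obs)
    return max(actions, key=lambda a: Q[s][a])

def q_update(model_state, obs, action, reward, next_model_state, next_obs,
             alpha=0.1, gamma=0.99):
    """Standard Q-learning update, applied to the extended state space."""
    s, s_next = (model_state, obs), (next_model_state, next_obs)
    best_next = max(Q[s_next].values(), default=0.0)
    Q[s][action] += alpha * (reward + gamma * best_next - Q[s][action])
```

After each environment step, the model state is advanced by following the transition of the learned MDP that matches the executed action and the received observation.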
Due to the older version of TensorFlow required by Stable Baselines v2, ensure that you have Python 3.6 installed. Note that our algorithm works with Python 3.6 and newer versions, but for the sake of comparison use Python 3.6.

With Python 3.6, create the virtual environment:
```
python3.6 -m venv myvenv
source myvenv/bin/activate   # Linux and Mac
myvenv\Scripts\activate.bat  # Windows
python -m pip install --upgrade pip setuptools  # to ensure that tensorflow 1.15 will be found
```

Install the required dependencies for the comparison, either with a one-liner:
```
pip install -r recquirements.txt
```

or install each dependency individually, in case you want to use your GPU for the comparison:
```
pip install aalpy
pip install stable-baselines
pip install tensorflow==1.15   # or tensorflow-gpu==1.15
pip install numpy==1.16.4
pip install gym==0.15.7
```

To make Alergia faster, we interface with jAlergia through AALpy. Ensure that you have Java added to your path. If you have Java >= 12, the provided alergia.jar should work out of the box. If you have a lower version of Java on your path, please compile your own .jar file and replace the one present in the repository:
```
git clone https://github.com/emuskardin/jAlergia
gradlew jar   # gradlew.bat on Windows
```
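A rough sketch of how a model can then be learned through AALpy's jAlergia interface is shown below. The trace file name is a placeholder, and the exact parameter names may differ between AALpy versions, so consult the AALpy documentation:

```python
from aalpy.learning_algs import run_JAlergia

# Learn an MDP approximation of the POMDP from previously sampled traces.
# 'sampled_traces.txt' is a placeholder for your own data file.
model = run_JAlergia(path_to_data_file='sampled_traces.txt',
                     automaton_type='mdp',
                     path_to_jAlergia_jar='alergia.jar',
                     eps=0.05)
print(model)
```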
To see an example of how active or passive automata learning methods can be used to approximate a POMDP, run:

```
python pomdp_approximation_demo.py
```

To run each experiment, simply call the appropriate Python script with the experiment name. The experiment names found in the paper are `oficeWorld`, `confusingOfficeWorld`, `gravity`, and `thinMaze`. Due to the stochasticity inherent in the environment and in reinforcement learning, run each experiment multiple times to obtain a better picture and more reliable results.
```
python partially_observable_q_learning.py <exp_name>
python reccurent_policy_comp.py <exp_name>
python stacked_frames_comp.py <exp_name>
```
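For example, a single Q^A-learning run on the office world experiment is started with:

```
python partially_observable_q_learning.py oficeWorld
```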