Learning Environment Models with Continuous Stochastic Dynamics - with an Application to Deep RL Testing

This repository contains all code required to reproduce the experiments reported in the paper "Learning Environment Models with Continuous Stochastic Dynamics".

Reproducibility and Setup

For the computation of schedulers, install the PRISM Model Checker. We used Java 12 (OpenJDK 12.0.2) to run alergia.jar and PRISM.

To reproduce all experiments, we recommend creating a new Python virtual environment in which you install all requirements. The code has been tested with Python 3.9 and PRISM 4.7.

python -m venv myenv
source myenv/bin/activate        # Linux and Mac
myenv\Scripts\activate.bat       # Windows
python -m pip install --upgrade pip   # update pip

To install requirements:

pip install -r requirements.txt

Code structure

Main files:

  • main.py - example file that can be used to learn environment models of well-established RL benchmarks; shows how to use our approach
  • iterative_refinement.py - contains the code that iteratively refines a learned model with respect to a given goal
  • diff_testing.py - all code required to differentially test multiple RL agents

Util files:

  • trace_abstraction.py - converts high-dimensional sequences to their abstract/discrete representations (see the sketch after this list)
  • discretization_pipeline.py - helper file with methods used to reduce data dimensionality
  • schedulers.py - interface between learned MDPs and PRISM
  • visualization_util.py - visualizes experiment results
  • utils.py - minor utility functions
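Conceptually, the abstraction step reduces the dimensionality of recorded observations and then clusters them into discrete labels. Below is a minimal, self-contained sketch of that idea using scikit-learn; the function name abstract_traces and all variable names are illustrative and do not mirror the actual API in trace_abstraction.py or discretization_pipeline.py.

# Illustrative sketch of clustering-based trace abstraction (not the repository's API).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def abstract_traces(traces, num_clusters=256, pca_dims=4):
    """Map continuous observations in each trace to discrete cluster labels."""
    # Stack all observations to fit the dimensionality reduction and the clustering.
    all_obs = np.vstack([np.asarray(t) for t in traces])
    reducer = PCA(n_components=pca_dims).fit(all_obs)
    clusterer = KMeans(n_clusters=num_clusters, n_init=10).fit(reducer.transform(all_obs))

    abstract = []
    for t in traces:
        labels = clusterer.predict(reducer.transform(np.asarray(t)))
        abstract.append([f"c{label}" for label in labels])  # e.g. ['c17', 'c3', ...]
    return abstract, reducer, clusterer

# Example with random data standing in for recorded Acrobot observations.
dummy_traces = [np.random.rand(50, 6) for _ in range(20)]
abstract, _, _ = abstract_traces(dummy_traces, num_clusters=16)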

Learning Environment Models with Continuous Stochastic Dynamics

To reproduce an experiment, simply call experiment_cmd_runner.py with the appropriate arguments, as shown in the following line:

python experiment_cmd_runner.py --path_to_prism "C:/Program Files/prism-4.7/bin/prism.bat" --path_to_alergia alergia.jar --env_name Acrobot --dim_reduction manual --num_initial_traces 2500 --num_clusters 256 --num_iterations 10 --episodes_in_iter 10 --exp_prefix exp_1_ --seed 101

Alternatively, you can change the variable values in main.py and execute any experiment from there.

To set a constant random seed for reproducibility, pass the --seed argument or set the seed in main.py.
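As an illustration only (the exact seeding code in main.py may differ and may cover additional libraries), fixing a run typically amounts to seeding Python, NumPy, and the Gym environment:

# Illustrative seeding snippet; main.py may seed further libraries (e.g. torch).
import random
import numpy as np
import gym

SEED = 101
random.seed(SEED)
np.random.seed(SEED)

env = gym.make("Acrobot-v1")
# Newer gym/gymnasium versions seed via reset(); older ones use env.seed(SEED).
obs = env.reset(seed=SEED)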

To visualize the plots found in the paper, run visualization_util.py. It can also be used to visualize new runs.

Output structure

Outputs of iterative refinement are printed to the console as the algorithm progresses, and after every refinement iteration multiple values are saved to a pickle so that experiments can be reproduced afterwards. For more details, check the bottom of the iteratively_refine_model function in iterative_refinement.py.
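As a rough illustration of how such a run can be inspected afterwards (the exact file name and the layout of the stored object are defined at the bottom of iteratively_refine_model, so treat the path below as a placeholder):

# Illustrative only: load one pickled refinement result and inspect its contents.
import pickle

with open("pickles/exp_1_Acrobot_results.pickle", "rb") as f:  # placeholder file name
    results = pickle.load(f)

# The stored object's structure is defined in iterative_refinement.py;
# printing it is a quick way to see what was recorded per iteration.
print(type(results))
print(results)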

Differential Testing

All code required to differentially test two agents with CASTLE is found in diff_testing.py. When running the script, please replace path_to_prism in line 26 with the appropriate install path.

To run differential testing with learned models (which are fine-tuned during differential testing), simply run diff_testing.py. To switch between the LunarLander and CartPole experiments, change the value of the experiment variable in line 313.

For each cluster of interest, the results of differential testing are saved to a pickle and a .txt file in the pickles/diff_testing/ folder.
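For example, the saved per-cluster results can be listed and loaded as follows (illustrative; the actual file names and contents depend on the experiment):

# Illustrative only: iterate over saved differential-testing results.
import pickle
from pathlib import Path

for p in Path("pickles/diff_testing").glob("*.pickle"):
    with p.open("rb") as f:
        cluster_result = pickle.load(f)
    print(p.name, type(cluster_result))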

The differential-testing plots found in the paper can be visualized with pickles/diff_testing/paper_diff_results/saftey_test_plots.py.
