
Code for FLEX, a fast, adaptive and flexible model-based reinforcement learning exploration algorithm.


MB-29/FLEX


We introduce a fast exploration algorithm for nonlinear systems. FLEX is a lightweight, model-based, pure exploration policy that maximizes the Fisher information.

Check out the project page for more information.
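One common way to make "maximizing the Fisher information" concrete is greedy D-optimal design: among candidate inputs, pick the one whose feature vector most increases the log-determinant of the information matrix. The sketch below illustrates that generic idea for a linear-in-parameters model with unit Gaussian noise. It is not taken from this repository, and the names `greedy_d_optimal_choice`, `M`, and `candidates` are hypothetical; refer to the paper for the actual FLEX objective.

```python
import numpy as np


def greedy_d_optimal_choice(M, candidates):
    """Return the index of the candidate row v maximizing log det(M + v v^T).

    M is the current (invertible) information matrix and each row of
    `candidates` is the feature vector that one candidate input would produce.
    """
    M_inv = np.linalg.inv(M)
    # Matrix determinant lemma: det(M + v v^T) = det(M) * (1 + v^T M^{-1} v),
    # so the best candidate is the one with the largest v^T M^{-1} v.
    gains = np.einsum("ij,jk,ik->i", candidates, M_inv, candidates)
    return int(np.argmax(gains))


# Toy example: the first direction is already well explored, the second is not.
M = np.diag([10.0, 1.0])
candidates = np.array([[1.0, 0.0], [0.0, 1.0]])
best = greedy_d_optimal_choice(M, candidates)
```

In this toy example the second candidate is selected, because it probes the direction in which the information matrix is smallest.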

Exploration of the pendulum environment

An animated demonstration of our algorithm in various environments can be found in this video.

Paper

Our algorithm is described in our paper FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems, accepted at ICML 2023.

To cite this work, please use the following reference.

Blanke, M., & Lelarge, M. (2023). FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems. arXiv preprint arXiv:2304.13426.

    @article{blanke2023flex,
      title={FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems},
      author={Blanke, Matthieu and Lelarge, Marc},
      journal={arXiv preprint arXiv:2304.13426},
      year={2023}
    }

Organization

An agent is defined by an exploration policy, which can be found in policies.py along with baselines. Pre-defined environments and models are available in the directories environments and models. Given an environment, an agent, a time horizon, and an evaluation function, the function exploration runs the exploration algorithm and returns the resulting state-action values and evaluation values.
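To make this interface more concrete, here is a minimal, self-contained sketch of what such an exploration loop could look like. It is an illustration under stated assumptions, not the code in exploration.py: `run_exploration`, `ToyEnvironment`, `ToyAgent`, and the method names `step`, `policy`, and `learning_step` are all hypothetical stand-ins.

```python
class ToyEnvironment:
    """Scalar linear system x_next = a * x + u (stand-in, not a repo class)."""

    def __init__(self, a=0.5):
        self.a = a
        self.x = 0.0

    def step(self, u):
        self.x = self.a * self.x + u
        return self.x


class ToyAgent:
    """Estimates `a` by least squares and alternates inputs +1 / -1."""

    def __init__(self):
        self.num = 0.0  # running sum of x * (x_next - u)
        self.den = 0.0  # running sum of x * x
        self.a_hat = 0.0

    def policy(self, x, t):
        return 1.0 if t % 2 == 0 else -1.0

    def learning_step(self, x, u, x_next):
        self.num += x * (x_next - u)
        self.den += x * x
        if self.den > 0:
            self.a_hat = self.num / self.den


def run_exploration(environment, agent, T, evaluation):
    """Act for T steps; record state-action pairs and evaluation values."""
    z_values, error_values = [], []
    x = environment.x
    for t in range(T):
        u = agent.policy(x, t)             # choose an input
        x_next = environment.step(u)       # observe the next state
        agent.learning_step(x, u, x_next)  # update the model estimate
        z_values.append((x, u))
        error_values.append(evaluation(agent))
        x = x_next
    return z_values, error_values


environment = ToyEnvironment(a=0.5)
agent = ToyAgent()
# Hypothetical evaluation function: absolute parameter estimation error.
evaluation = lambda agent: abs(agent.a_hat - environment.a)
z_values, error_values = run_exploration(environment, agent, 20, evaluation)
```

Passing the evaluation function as an argument keeps the loop independent of how model quality is measured, which matches the role the evaluation function plays in the description above.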

Example

The following code can be executed by running python exploration.py.

    from environments.pendulum import DampedPendulum
    from models.pendulum import Linear
    from policies import Random, Flex
    from exploration import exploration

    T = 300    # time horizon
    dt = 1e-2  # time step

    environment = DampedPendulum(dt)
    model = Linear(environment)
    evaluation = model.evaluation

    # agent = Random(  # swap with the next line to run the random baseline
    agent = Flex(
        model,
        environment.d,
        environment.m,
        environment.gamma,
        dt=environment.dt,
    )

    z_values, error_values = exploration(environment, agent, T, evaluation)
