mostafaelaraby/challenge-aido_LF-baseline-dagger-pytorch
In this baseline we train a small SqueezeNet model on expert trajectories to clone the behavior of the expert. Using only the expert trajectories would result in a model unable to recover from non-optimal positions. Instead, we use DAgger, a dataset aggregation technique that collects data with a policy mixed between the expert and the model.
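The mixed-policy idea can be illustrated with a toy sketch. This is not the repository's training code: the one-dimensional "lane offset" environment, the proportional expert, the linear least-squares learner, and the geometric decay of the mixing probability are all assumptions made for illustration. What it does show is the core of DAgger: the executed action comes from either the expert or the learner, but the expert's action is always the one stored as the label.

```python
import random

def expert_policy(offset):
    # Hypothetical expert: steer proportionally back toward the lane center.
    return -0.5 * offset

def step(offset, action):
    # Toy dynamics (assumption): the action directly shifts the lateral offset.
    return offset + action

def fit_linear(dataset):
    # Least-squares fit of action = w * offset; a stand-in for training the network.
    num = sum(s * a for s, a in dataset)
    den = sum(s * s for s, _ in dataset)
    return num / den if den > 0 else 0.0

def run_dagger(episodes=10, horizon=64, initial_beta=0.7, seed=0):
    rng = random.Random(seed)
    dataset = []          # aggregated (state, expert action) pairs from all episodes
    weight = 0.0          # untrained learner: always outputs 0
    beta = initial_beta   # probability of letting the expert act
    for _ in range(episodes):
        offset = 1.0
        for _ in range(horizon):
            expert_action = expert_policy(offset)
            # Always label the visited state with the expert's action.
            dataset.append((offset, expert_action))
            # Mixed policy: the expert acts with probability beta, else the learner.
            if rng.random() < beta:
                action = expert_action
            else:
                action = weight * offset
            offset = step(offset, action)
        # Retrain on everything collected so far (dataset aggregation).
        weight = fit_linear(dataset)
        # Assumed schedule: rely on the expert less after each episode.
        beta *= initial_beta
    return weight, dataset

if __name__ == "__main__":
    weight, dataset = run_dagger()
    print(len(dataset), weight)
```

Because the learner's own actions drive part of each rollout, the dataset contains states the expert alone would not have visited, which is what lets the trained model recover from non-optimal positions.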
Use the Jupyter notebook notebook.ipynb to quickly start training and testing the DAgger imitation learning baseline.
Clone this repo:
$ git clone https://github.com/duckietown/gym-duckietown.git
$ cd gym-duckietown
$ pip3 install -e .
$ python -m learning.train
- --episode or -i: an integer specifying the number of episodes to train the agent, defaults to 10.
- --horizon or -r: an integer specifying the length of the horizon in each episode, defaults to 64.
- --learning-rate or -l: an integer specifying the index from the list [1e-1, 1e-2, 1e-3, 1e-4, 1e-5] to select the learning rate, defaults to 2.
- --decay or -d: an integer specifying the index from the list [0.5, 0.6, 0.7, 0.8, 0.85, 0.9, 0.95] to select the initial probability of choosing the teacher over the learner.
- --save-path or -s: a string specifying the path where the trained model is saved; the model is overwritten to keep the latest episode. Defaults to a file named iil_baseline.pt in the project root.
- --map-name or -m: a string specifying which map to use for training, defaults to loop_empty.
- --num-outputs: an integer specifying the number of outputs the model will have; can be modified to train only angular speed. Defaults to 2 for both linear and angular speed.
- --domain-rand or -dr: a flag to enable domain randomization for transferability from simulation to the real world.
- --randomize-map or -rm: a flag to randomize training maps on reset.
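Note that --learning-rate and --decay take an index into a fixed list rather than the value itself. A minimal sketch of that lookup follows; the two lists are copied from the option descriptions above, while the constant names and the `select` helper are hypothetical and not taken from the repository.

```python
# Value lists as given in the option descriptions.
LEARNING_RATES = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]
DECAYS = [0.5, 0.6, 0.7, 0.8, 0.85, 0.9, 0.95]

def select(options, index):
    # Hypothetical helper: map a command-line index to the actual value,
    # rejecting indices outside the list instead of wrapping around.
    if not 0 <= index < len(options):
        raise ValueError(f"index {index} out of range 0..{len(options) - 1}")
    return options[index]

if __name__ == "__main__":
    print(select(LEARNING_RATES, 2))  # prints 0.001
    print(select(DECAYS, 0))          # prints 0.5
```

So the default --learning-rate index of 2 corresponds to a learning rate of 1e-3.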
$ python -m learning.test
- --model-path or -mp: a string specifying the path to the saved model to be used in testing.
- --episode or -i: an integer specifying the number of episodes to test the agent, defaults to 10.
- --horizon or -r: an integer specifying the length of the horizon in each episode, defaults to 64.
- --save-path or -s: a string specifying the path where the trained model is saved; the model is overwritten to keep the latest episode. Defaults to a file named iil_baseline.pt in the project root.
- --num-outputs: an integer specifying the number of outputs the model has, defaults to 2.
- --map-name or -m: a string specifying which map to use for training, defaults to loop_empty.
- Copy the trained model files into the submission/models directory and then use duckietown shell to submit.
- For more information on submitting, check the duckietown shell documentation.
- We started from previous work done by Manfred Díaz as a boilerplate, and we would like to thank him for his full support with code and answering our questions.
- Mostafa ElAraby
- Ramon Emiliani
@phdthesis{diaz2018interactive,
  title={Interactive and Uncertainty-aware Imitation Learning: Theory and Applications},
  author={Diaz Cabrera, Manfred Ramon},
  year={2018},
  school={Concordia University}
}

@inproceedings{ross2011reduction,
  title={A reduction of imitation learning and structured prediction to no-regret online learning},
  author={Ross, St{\'e}phane and Gordon, Geoffrey and Bagnell, Drew},
  booktitle={Proceedings of the fourteenth international conference on artificial intelligence and statistics},
  pages={627--635},
  year={2011}
}

@article{loquercio2018dronet,
  title={Dronet: Learning to fly by driving},
  author={Loquercio, Antonio and Maqueda, Ana I and Del-Blanco, Carlos R and Scaramuzza, Davide},
  journal={IEEE Robotics and Automation Letters},
  volume={3},
  number={2},
  pages={1088--1095},
  year={2018},
  publisher={IEEE}
}