Proceedings of Machine Learning Research


Implicit Quantile Networks for Distributional Reinforcement Learning

Will Dabney, Georg Ostrovski, David Silver, Remi Munos
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1096-1105, 2018.

Abstract

In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regression to approximate the full quantile function for the state-action return distribution. By reparameterizing a distribution over the sample space, this yields an implicitly defined return distribution and gives rise to a large class of risk-sensitive policies. We demonstrate improved performance on the 57 Atari 2600 games in the ALE, and use our algorithm’s implicitly defined distributions to study the effects of risk-sensitive policies in Atari games.
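The core ideas in the abstract — sampling quantile fractions tau ~ U(0,1), embedding them, merging the embedding with a state representation, and training by quantile regression — can be sketched in a few lines. The following PyTorch code is a minimal, illustrative reading of that description, not the paper's exact architecture: the layer sizes, the number of cosine features, and all names (ImplicitQuantileNetwork, quantile_huber_loss) are assumptions for exposition.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImplicitQuantileNetwork(nn.Module):
    """Illustrative sketch of an implicit quantile network (IQN).

    Maps a state embedding and sampled quantile fractions tau to
    estimated quantile values of the return distribution Z(x, a).
    Layer sizes and the cosine embedding width are assumptions.
    """

    def __init__(self, state_dim, num_actions, embed_dim=64, hidden=128):
        super().__init__()
        self.embed_dim = embed_dim
        self.psi = nn.Linear(state_dim, hidden)    # state embedding psi(x)
        self.phi = nn.Linear(embed_dim, hidden)    # quantile embedding phi(tau)
        self.out = nn.Linear(hidden, num_actions)  # per-action quantile values

    def forward(self, state, tau):
        # state: (batch, state_dim); tau: (batch, n_tau), each entry in (0, 1)
        psi = F.relu(self.psi(state)).unsqueeze(1)            # (batch, 1, hidden)
        # Cosine basis features of tau: cos(pi * i * tau), i = 0..embed_dim-1.
        i = torch.arange(self.embed_dim, device=tau.device).float()
        cos = torch.cos(math.pi * i * tau.unsqueeze(-1))      # (batch, n_tau, embed_dim)
        phi = F.relu(self.phi(cos))                           # (batch, n_tau, hidden)
        # Elementwise product merges the state and quantile embeddings.
        return self.out(psi * phi)                            # (batch, n_tau, num_actions)

def quantile_huber_loss(pred, target, tau, kappa=1.0):
    """Quantile regression with a Huber penalty.

    pred: (batch, n_pred) predicted quantile samples at fractions tau.
    target: (batch, n_target) target quantile samples (already detached).
    """
    # Pairwise TD errors between every target and predicted sample.
    u = target.unsqueeze(1) - pred.unsqueeze(2)     # (batch, n_pred, n_target)
    huber = torch.where(u.abs() <= kappa,
                        0.5 * u.pow(2),
                        kappa * (u.abs() - 0.5 * kappa))
    # Asymmetric quantile weighting: |tau - 1{u < 0}|.
    weight = (tau.unsqueeze(2) - (u.detach() < 0).float()).abs()
    return (weight * huber / kappa).sum(1).mean()

# Example usage with random data (hypothetical dimensions):
net = ImplicitQuantileNetwork(state_dim=8, num_actions=3)
state = torch.randn(4, 8)
tau = torch.rand(4, 16)
q_tau = net(state, tau)        # (4, 16, 3) sampled quantile values
q = q_tau.mean(dim=1)          # risk-neutral Q-values: mean over quantile samples
greedy_action = q.argmax(dim=1)

The risk-sensitive policies mentioned in the abstract amount to distorting how tau is sampled at action-selection time; for instance, drawing tau ~ U(0, alpha) for small alpha yields a CVaR-style risk-averse policy from the same network.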

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-dabney18a,
  title     = {Implicit Quantile Networks for Distributional Reinforcement Learning},
  author    = {Dabney, Will and Ostrovski, Georg and Silver, David and Munos, Remi},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {1096--1105},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/dabney18a/dabney18a.pdf},
  url       = {https://proceedings.mlr.press/v80/dabney18a.html},
  abstract  = {In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regression to approximate the full quantile function for the state-action return distribution. By reparameterizing a distribution over the sample space, this yields an implicitly defined return distribution and gives rise to a large class of risk-sensitive policies. We demonstrate improved performance on the 57 Atari 2600 games in the ALE, and use our algorithm’s implicitly defined distributions to study the effects of risk-sensitive policies in Atari games.}
}
Endnote
%0 Conference Paper
%T Implicit Quantile Networks for Distributional Reinforcement Learning
%A Will Dabney
%A Georg Ostrovski
%A David Silver
%A Remi Munos
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-dabney18a
%I PMLR
%P 1096--1105
%U https://proceedings.mlr.press/v80/dabney18a.html
%V 80
%X In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regression to approximate the full quantile function for the state-action return distribution. By reparameterizing a distribution over the sample space, this yields an implicitly defined return distribution and gives rise to a large class of risk-sensitive policies. We demonstrate improved performance on the 57 Atari 2600 games in the ALE, and use our algorithm’s implicitly defined distributions to study the effects of risk-sensitive policies in Atari games.
APA
Dabney, W., Ostrovski, G., Silver, D. & Munos, R. (2018). Implicit Quantile Networks for Distributional Reinforcement Learning. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:1096-1105. Available from https://proceedings.mlr.press/v80/dabney18a.html.
