
Neural Fitted Q Iteration – First Experiences with a Data Efficient Neural Reinforcement Learning Method

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3720)

Included in the conference series: European Conference on Machine Learning (ECML 2005)

Abstract

This paper introduces NFQ, an algorithm for the efficient and effective training of a Q-value function represented by a multi-layer perceptron. Based on the principle of storing and reusing transition experiences, a model-free, neural-network-based reinforcement learning algorithm is proposed. The method is evaluated on three benchmark problems. It is shown empirically that reasonably few interactions with the plant are needed to generate control policies of high quality.
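The core idea sketched in the abstract is batch fitted Q iteration with a neural network: all observed transitions are stored, and in each iteration a fresh supervised training set of Q-targets is built from them and the network is refit on the whole batch. The paper trains the MLP with Rprop; the sketch below is only an illustration of that loop, substituting scikit-learn's `MLPRegressor` and a made-up 5-state chain task, so the task, network size, and hyperparameters are assumptions, not the paper's setup. Note that NFQ is formulated in terms of path *costs* to be minimized, hence the `min` over successor actions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical 5-state chain task (not from the paper): states 0..4,
# actions 0 = left, 1 = right, goal at state 4. Every transition costs 1,
# except the transition into the goal, which costs 0 (costs are minimized).
def sample_transitions(n=500, seed=0):
    rng = np.random.default_rng(seed)
    s = rng.integers(0, 4, size=n)                       # non-terminal start states
    a = rng.integers(0, 2, size=n)
    s2 = np.clip(s + np.where(a == 1, 1, -1), 0, 4)
    done = (s2 == 4).astype(float)
    c = np.where(done == 1.0, 0.0, 1.0)
    return s.astype(float), a.astype(float), c, s2.astype(float), done

def nfq(transitions, n_actions=2, gamma=0.95, iterations=10):
    """NFQ sketch: one MLP represents Q(s, a); each iteration rebuilds the
    whole supervised target set from the stored transitions and refits."""
    s, a, c, s2, done = transitions
    X = np.column_stack([s, a])
    net = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=1000, random_state=0)
    net.fit(X, c)                                        # initialise on immediate costs
    for _ in range(iterations):
        # cost of the best (cheapest) successor action; masked at terminals
        q_next = np.stack(
            [net.predict(np.column_stack([s2, np.full_like(s2, b)]))
             for b in range(n_actions)], axis=1)
        targets = c + gamma * (1.0 - done) * q_next.min(axis=1)
        net.fit(X, targets)                              # batch refit on all data
    return net

net = nfq(sample_transitions())
q0 = [net.predict(np.array([[0.0, b]]))[0] for b in (0, 1)]
```

Because every iteration trains on the complete stored batch rather than a single online sample, the regression step is stable enough to use an aggressive supervised learner, which is the data-efficiency argument the abstract makes.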




Author information

Authors and Affiliations

  1. Neuroinformatics Group, University of Osnabrück, 49078 Osnabrück, Germany

    Martin Riedmiller


Editor information

Editors and Affiliations

  1. Faculty of Economics of the University of Porto, Portugal

    João Gama

  2. Faculdade de Engenharia & LIAAD, Universidade do Porto, Portugal

    Rui Camacho

  3. LIAAD-INESC Porto L.A./Faculty of Economics, University of Porto, Rua de Ceuta, 118-6, 4050-190, Porto, Portugal

    Pavel B. Brazdil

  4. LIACC/FEP, Universidade do Porto, Portugal

    Alípio Mário Jorge

  5. LIAAD-INESC Porto LA / FEP, University of Porto, R. de Ceuta, 118, 6., 4050-190, Porto, Portugal

    Luís Torgo


Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Riedmiller, M. (2005). Neural Fitted Q Iteration – First Experiences with a Data Efficient Neural Reinforcement Learning Method. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds) Machine Learning: ECML 2005. Lecture Notes in Computer Science, vol 3720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564096_32



