Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7188)
Abstract
This paper investigates and compares single-agent reinforcement learning (RL) algorithms on the simple taxi problem domain and an extended version of it, as well as multiagent RL algorithms on a multiagent extension of the simple taxi domain that we created. In particular, we extend the Policy Hill Climbing (PHC) and the Win or Learn Fast-PHC (WoLF-PHC) algorithms by combining them with the MAXQ hierarchical decomposition, and we investigate their efficiency. The results on the multiagent domain are very promising, as they indicate that these two newly created algorithms are the most efficient of those compared.
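Since the full text is only previewed here, the following is a minimal sketch of the flat, tabular WoLF-PHC update (Bowling and Veloso, 2002) on which the paper builds. It is not the authors' MAXQ-hierarchical extension: in their variant these updates would presumably be applied within each MAXQ subtask rather than over the flat state space. All hyperparameter values below are illustrative assumptions.

```python
import numpy as np


class WoLFPHC:
    """Tabular WoLF-PHC: Policy Hill Climbing with 'Win or Learn Fast' step sizes."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 delta_win=0.01, delta_lose=0.04, epsilon=0.1, seed=0):
        self.nA = n_actions
        self.alpha, self.gamma = alpha, gamma
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.epsilon = epsilon
        self.rng = np.random.default_rng(seed)
        self.Q = np.zeros((n_states, n_actions))                       # action values
        self.pi = np.full((n_states, n_actions), 1.0 / n_actions)      # current mixed policy
        self.pi_avg = np.full((n_states, n_actions), 1.0 / n_actions)  # running average policy
        self.visits = np.zeros(n_states)                               # state visit counts

    def act(self, s):
        # epsilon-greedy exploration layered on top of the stochastic policy
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(self.nA))
        p = self.pi[s] / self.pi[s].sum()  # renormalise against float drift
        return int(self.rng.choice(self.nA, p=p))

    def update(self, s, a, r, s_next):
        # 1) ordinary Q-learning backup
        self.Q[s, a] += self.alpha * (r + self.gamma * self.Q[s_next].max() - self.Q[s, a])

        # 2) incrementally update the running average policy for state s
        self.visits[s] += 1
        self.pi_avg[s] += (self.pi[s] - self.pi_avg[s]) / self.visits[s]

        # 3) "win or learn fast": take a small step while winning (current policy
        #    outperforms the average policy against current Q), a large one while losing
        winning = self.pi[s] @ self.Q[s] > self.pi_avg[s] @ self.Q[s]
        delta = self.delta_win if winning else self.delta_lose

        # 4) hill-climb: move probability mass toward the greedy action while
        #    keeping pi[s] a valid probability distribution
        greedy = int(self.Q[s].argmax())
        for other in range(self.nA):
            if other != greedy:
                step = min(self.pi[s, other], delta / (self.nA - 1))
                self.pi[s, other] -= step
                self.pi[s, greedy] += step
```

Plain PHC is the special case delta_win == delta_lose, in which the average-policy bookkeeping has no effect; the asymmetry (learning cautiously while winning, quickly while losing) is what gives WoLF-PHC its convergence behaviour in self-play.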
Author information
Authors and Affiliations
Department of Computer Science, University of Cyprus, Nicosia, 1678, Cyprus
Ioannis Lambrou, Vassilis Vassiliades & Chris Christodoulou
Editor information
Editors and Affiliations
NICTA and the Australian National University, 7 London Circuit, ACT 2601, Canberra, Australia
Scott Sanner
Research School of Computer Science, Australian National University, ACT 0200, Canberra, Australia
Marcus Hutter
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lambrou, I., Vassiliades, V., Christodoulou, C. (2012). An Extension of a Hierarchical Reinforcement Learning Algorithm for Multiagent Settings. In: Sanner, S., Hutter, M. (eds.) Recent Advances in Reinforcement Learning. EWRL 2011. Lecture Notes in Computer Science (LNAI), vol. 7188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29946-9_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29945-2
Online ISBN: 978-3-642-29946-9
eBook Packages: Computer Science, Computer Science (R0)