Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7188)
Abstract
This paper investigates and compares single-agent reinforcement learning (RL) algorithms on the simple taxi problem domain and an extended version of it, as well as multiagent RL algorithms on a multiagent extension of the simple taxi domain that we created. In particular, we extend the Policy Hill Climbing (PHC) and the Win or Learn Fast-PHC (WoLF-PHC) algorithms by combining them with the MAXQ hierarchical decomposition, and we investigate their efficiency. The results on the multiagent domain are very promising, as they indicate that these two newly created algorithms are the most efficient of those compared.
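Since the full text is only previewed here, the following is a minimal sketch of the flat, tabular WoLF-PHC update (Bowling and Veloso, 2002) on which the paper builds. It is not the authors' MAXQ-hierarchical extension: in their variant these updates would presumably be applied within each MAXQ subtask rather than over the flat state space. All hyperparameter values below are illustrative assumptions.

```python
import numpy as np


class WoLFPHC:
    """Tabular WoLF-PHC: Policy Hill Climbing with 'Win or Learn Fast' step sizes."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 delta_win=0.01, delta_lose=0.04, epsilon=0.1, seed=0):
        self.nA = n_actions
        self.alpha, self.gamma = alpha, gamma
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.epsilon = epsilon
        self.rng = np.random.default_rng(seed)
        self.Q = np.zeros((n_states, n_actions))                       # action values
        self.pi = np.full((n_states, n_actions), 1.0 / n_actions)      # current mixed policy
        self.pi_avg = np.full((n_states, n_actions), 1.0 / n_actions)  # running average policy
        self.visits = np.zeros(n_states)                               # state visit counts

    def act(self, s):
        # epsilon-greedy exploration layered on top of the stochastic policy
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(self.nA))
        p = self.pi[s] / self.pi[s].sum()  # renormalise against float drift
        return int(self.rng.choice(self.nA, p=p))

    def update(self, s, a, r, s_next):
        # 1) ordinary Q-learning backup
        self.Q[s, a] += self.alpha * (r + self.gamma * self.Q[s_next].max() - self.Q[s, a])

        # 2) incrementally update the running average policy for state s
        self.visits[s] += 1
        self.pi_avg[s] += (self.pi[s] - self.pi_avg[s]) / self.visits[s]

        # 3) "win or learn fast": take a small step while winning (current policy
        #    outperforms the average policy against current Q), a large one while losing
        winning = self.pi[s] @ self.Q[s] > self.pi_avg[s] @ self.Q[s]
        delta = self.delta_win if winning else self.delta_lose

        # 4) hill-climb: move probability mass toward the greedy action while
        #    keeping pi[s] a valid probability distribution
        greedy = int(self.Q[s].argmax())
        for other in range(self.nA):
            if other != greedy:
                step = min(self.pi[s, other], delta / (self.nA - 1))
                self.pi[s, other] -= step
                self.pi[s, greedy] += step
```

Plain PHC is the special case delta_win == delta_lose, in which the average-policy bookkeeping has no effect; the asymmetry (learning cautiously while winning, quickly while losing) is what gives WoLF-PHC its convergence behaviour in self-play.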
Author information
Authors and Affiliations
Department of Computer Science, University of Cyprus, Nicosia, 1678, Cyprus
Ioannis Lambrou, Vassilis Vassiliades & Chris Christodoulou
Editor information
Editors and Affiliations
NICTA and the Australian National University, 7 London Circuit, ACT 2601, Canberra, Australia
Scott Sanner
Research School of Computer Science, Australian National University, ACT 0200, Canberra, Australia
Marcus Hutter
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lambrou, I., Vassiliades, V., Christodoulou, C. (2012). An Extension of a Hierarchical Reinforcement Learning Algorithm for Multiagent Settings. In: Sanner, S., Hutter, M. (eds.) Recent Advances in Reinforcement Learning. EWRL 2011. Lecture Notes in Computer Science (LNAI), vol. 7188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29946-9_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29945-2
Online ISBN: 978-3-642-29946-9
eBook Packages: Computer Science, Computer Science (R0)