APPLICATION OF META-LEARNING IN MULTI-AGENT REINFORCEMENT LEARNING - A SURVEY
Volume 2, Issue 8, Pp 1-6, 2024
DOI: 10.61784/tsshr3075
Author(s)
YanQiao Ji
Affiliation(s)
Liaoning Equipment Manufacturing Vocational and Technical College, Shenyang 110161, Liaoning, China.
Corresponding Author
YanQiao Ji
ABSTRACT
This survey provides a comprehensive overview of the application of meta-learning in multi-agent reinforcement learning (MARL). Meta-learning, also known as learning to learn, has emerged as a promising approach to improving the learning efficiency and adaptability of reinforcement learning algorithms. The article explores the challenges and opportunities in applying meta-learning to MARL and highlights potential benefits such as faster convergence, improved generalization, and better coordination among agents.
KEYWORDS
Meta-learning; Reinforcement learning; Artificial intelligence
CITE THIS PAPER
YanQiao Ji. Application of meta-learning in multi-agent reinforcement learning - A survey. Trends in Social Sciences and Humanities Research. 2024, 2(8): 1-6. DOI: 10.61784/tsshr3075.