APPLICATION OF META-LEARNING IN MULTI-AGENT REINFORCEMENT LEARNING - A SURVEY
Keywords:
Meta-learning, Reinforcement learning, Artificial intelligence
Abstract
This survey provides a comprehensive overview of the application of meta-learning to multi-agent reinforcement learning (MARL). Meta-learning, also known as learning to learn, has emerged as a promising approach to improving the learning efficiency and adaptability of reinforcement learning algorithms. The survey explores the challenges and opportunities in applying meta-learning to MARL and highlights potential benefits such as faster convergence, improved generalization, and better coordination among agents.