2023 International Conference on High Performance Big Data and Intelligent Systems (HDIS)

Abstract

In recent years, the study of multi-agent collaborative games with deep reinforcement learning has become an active area of artificial intelligence research. This paper contributes to that field by investigating the on-policy Multi-Agent Proximal Policy Optimization (MAPPO) algorithm, building on prior studies to examine its efficacy in multi-agent collaborative gaming scenarios and offering new perspectives for subsequent work. Using the Hanabi game environment, this study implements MAPPO with a suitably designed action space, aiming to maximize the collaborative efficiency and competitive capability of the agents. Empirical results show that MAPPO performs notably well in collaborative play: compared with the off-policy Value-Decomposition Networks (VDN) algorithm [1], agents trained with MAPPO make more efficient decisions and achieve better overall outcomes. These results underscore the viability and advantages of MAPPO for multi-agent collaborative games. Beyond demonstrating a practical application of MAPPO, the experiment offers insights for improving reinforcement learning algorithms and deploying them in real-world settings, and it raises new questions that can guide and inspire future research in this field.
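For context, MAPPO applies PPO's clipped surrogate objective to each agent's policy while a centralized critic, trained on joint or global-state inputs, supplies the advantage estimates. As a sketch, the standard clipped objective reads (notation follows the general PPO/MAPPO literature, not this paper):

L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\!\left[ \min\!\left( r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}\!\left(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_t \right) \right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid o_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid o_t)},

where o_t and a_t denote an agent's local observation and action, \hat{A}_t is the advantage estimated with the centralized value function, and \epsilon is the clipping parameter.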