Advancing DRL Agents in Commercial Fighting Games: Training, Integration, and Agent-Human Alignment
Chen Zhang, Qiang He, Zhou Yuan, Elvis S. Liu, Hong Wang, Jian Zhao, Yang Wang
TL;DR
The paper tackles deploying DRL agents in a large-scale commercial fighting game, where hundreds of characters and real-time interaction make training and generalization difficult. It introduces Shūkai, a unified model trained with Heterogeneous League Training (HELT), which combines three input structures (FIS, QS, FQS) and multi-style rewards to balance competence, generalization, and human alignment. HELT accelerates training and broadens policy coverage, achieving a 22% gain in sample efficiency, while competence remains consistent on characters unseen during training. Deployment in Naruto Mobile demonstrates benefits in player engagement and retention, advancing the practical integration of DRL agents into large-scale commercial games.
Abstract
Deep Reinforcement Learning (DRL) agents have demonstrated impressive success in a wide range of game genres. However, existing research primarily focuses on optimizing DRL competence rather than addressing the challenge of prolonged player interaction. In this paper, we propose a practical DRL agent system for fighting games named Shūkai, which has been successfully deployed to Naruto Mobile, a popular fighting game with over 100 million registered users. Shūkai quantifies the state to enhance generalizability and introduces Heterogeneous League Training (HELT) to balance competence, generalizability, and training efficiency. Furthermore, Shūkai implements specific rewards to align the agent's behavior with human expectations. Shūkai's ability to generalize is demonstrated by its consistent competence across all characters, even though it was trained on only 13% of them. Additionally, HELT yields a 22% improvement in sample efficiency. Shūkai serves as a valuable training partner for players in Naruto Mobile, helping them improve their skills.
