BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis
Shuhang Lin, Wenyue Hua, Lingyao Li, Che-Jui Chang, Lizhou Fan, Jianchao Ji, Hang Hua, Mingyu Jin, Jiebo Luo, Yongfeng Zhang
TL;DR
BattleAgent addresses the lack of micro-level historical insight by integrating large vision-language models with a dynamic multi-agent system to simulate historical battles. The framework emulates both leadership decision-making and ordinary soldiers' experiences across four medieval battles, featuring 30 detailed soldier agents and a 51-action space within a time-quantized, multi-modal sandbox. Key contributions include anonymizing battle data to reduce priors, recording historical action trajectories, and evaluating multiple backbones (Claude-3, GPT-4, GPT-4V) against historical records to assess casualty trajectories and agent behavior. The work offers a tool for immersive historical analysis and education and points toward future enhancements such as expert systems, richer soldier modeling, and broader battle types.
Abstract
This paper presents BattleAgent, an emulation system that combines a large vision-language model with a multi-agent system. This novel system aims to simulate complex dynamic interactions among multiple agents, as well as between agents and their environments, over a period of time. It emulates both the decision-making processes of leaders and the viewpoints of ordinary participants, such as soldiers. The emulation showcases the current capabilities of agents, featuring fine-grained multi-modal interactions between agents and landscapes. It provides customizable agent structures to meet specific situational requirements, for example, a variety of battle-related activities such as scouting and trench digging. These components work together to recreate historical events in a lively and comprehensive manner while offering insights into the thoughts and feelings of individuals from diverse viewpoints. The technological foundations of BattleAgent establish detailed and immersive settings for historical battles, enabling individual agents to take part in, observe, and dynamically respond to evolving battle scenarios. This methodology holds the potential to substantially deepen our understanding of historical events, particularly through individual accounts. Such initiatives can also aid historical research, as conventional historical narratives often lack documentation and prioritize the perspectives of decision-makers, thereby overlooking the experiences of ordinary individuals. BattleAgent illustrates AI's potential to revitalize the human aspect of crucial social events, thereby fostering a more nuanced collective understanding and driving the progressive development of human society.
