Toward Human-AI Alignment in Large-Scale Multi-Player Games

Sugandha Sharma; Guy Davidson; Khimya Khetarpal; Anssi Kanervisto; Udit Arora; Katja Hofmann; Ida Momennejad

TL;DR

The paper tackles human-AI alignment in large-scale multiplayer games by introducing a Task-sets framework to derive an interpretable behavioral manifold with axes Fight-Flight, Explore-Exploit, and Solo-Multi-Agent. It analyzes ~100K Bleeding Edge games to extract human behavioral patterns and trains a ~222M-parameter Generative Pretrained Causal Transformer via behavior cloning, projecting both human and AI behaviors onto the same manifold for comparison. Findings show substantial human variability along the three axes, while the AI agent exhibits uniform, predominantly solo behavior, highlighting alignment gaps. The framework enables interpretable evaluation of alignment and offers a pathway to targeted agent design and broader application in human-centered AI development.

Abstract

Achieving human-AI alignment in complex multi-agent games is crucial for creating trustworthy AI agents that enhance gameplay. We propose a method to evaluate this alignment using an interpretable task-sets framework, focusing on high-level behavioral tasks instead of low-level policies. Our approach has three components. First, we analyze extensive human gameplay data from Xbox's Bleeding Edge (100K+ games), uncovering behavioral patterns in a complex task space. This task space serves as a basis set for a behavior manifold capturing interpretable axes: fight-flight, explore-exploit, and solo-multi-agent. Second, we train an AI agent to play Bleeding Edge using a Generative Pretrained Causal Transformer and measure its behavior. Third, we project human and AI gameplay onto the proposed behavior manifold to compare and contrast them. This allows us to interpret differences in policy as higher-level behavioral concepts. For example, we find that while human players exhibit variability in fight-flight and explore-exploit behavior, AI players tend towards uniformity. Furthermore, AI agents predominantly engage in solo play, while humans often engage in cooperative and competitive multi-agent patterns. These stark differences underscore the need for interpretable evaluation, design, and integration of AI in human-aligned applications. Our study advances the alignment discussion in AI and especially generative AI research, offering a measurable framework for interpretable human-agent alignment in multiplayer gaming.
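The comparison described in the abstract (variable humans versus uniform AI agents along the three axes) can be illustrated with a small sketch. This is not the paper's pipeline, which derives its manifold from task-sets and a UMAP embedding; it only assumes that each player has already been assigned a score in [0, 1] on each axis. The axis keys, the `axis_spread` helper, and all scores below are hypothetical toy values made up for illustration.

```python
from statistics import pstdev

# Hypothetical axis names mirroring the three interpretable axes in the abstract.
AXES = ("fight_flight", "explore_exploit", "solo_multi")


def axis_spread(players):
    """Return the population standard deviation of player scores on each axis."""
    return {axis: pstdev([p[axis] for p in players]) for axis in AXES}


# Toy scores, invented for illustration only (not data from the paper).
humans = [
    {"fight_flight": 0.1, "explore_exploit": 0.9, "solo_multi": 0.8},
    {"fight_flight": 0.9, "explore_exploit": 0.2, "solo_multi": 0.6},
    {"fight_flight": 0.5, "explore_exploit": 0.5, "solo_multi": 0.9},
]
ai_agents = [
    {"fight_flight": 0.50, "explore_exploit": 0.48, "solo_multi": 0.10},
    {"fight_flight": 0.52, "explore_exploit": 0.50, "solo_multi": 0.12},
    {"fight_flight": 0.49, "explore_exploit": 0.51, "solo_multi": 0.08},
]

human_spread = axis_spread(humans)
ai_spread = axis_spread(ai_agents)

# A larger spread means more behavioral variability across players on that axis.
for axis in AXES:
    print(f"{axis}: human={human_spread[axis]:.3f} ai={ai_spread[axis]:.3f}")
```

With toy inputs shaped like the reported finding, the human spread exceeds the AI spread on every axis. A per-axis summary like this is one simple way to turn "variability versus uniformity" into a number, under the stated assumption that axis scores are available.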

Paper Structure (25 sections, 1 equation, 16 figures, 3 tables)

Introduction
Bleeding Edge
AI Agent
Behavioral analysis of gameplay data
Task-sets
Simultaneous affordance-completion analysis
Results
Fight-Flight
Explore-Exploit
Solo-Multi-agent play
Discussion
Appendix
Acknowledgements
Why Bleeding Edge
AI Agent details
...and 10 more sections

Figures (16)

Figure 1: Bleeding Edge Power Collection game mode. (a) The analysis pipeline begins with task-sets used to extract the UMAP manifold embedding, which is interpreted to derive a 3D schematic of the human and AI behavioral manifold. Humans vary widely in how they express fight-flight and explore-exploit behavior; they predominantly play in multi-agent settings. AI agents exhibit low variability in fight-flight and explore-exploit behavior, tending towards uniformity; they predominantly play solo. (b) Collection phase (left) and Deposit phase (right) in the Power Collection game mode. (c) Three character types (Support, Tank and Damage) with 13 possible characters in the game.
Figure 2: AI agent architecture. The architecture consists of a ResNet-style encoder followed by a Causal Transformer. The model input is a sequence of image observations ($O_i$) of length $T$, and the model is trained to predict actions ($A_i$).
Figure 3: Cognitive themes used for behavioral analysis. (a) Ubiquitous cognitive themes in behaviors of various biological species used for analyzing gameplay dynamics of both human and AI agents. (b) Example task-sets defined for each of the cognitive themes. (c) Simultaneous affordance-completion curves for a subset of task-sets defined under the three cognitive themes.
Figure 4: Fight-Flight analysis results. (a) Simultaneous affordance-completion curves for two representative pairs of fight-flight task-sets from human gameplay data. (b) Top: An unsupervised 2D UMAP (Uniform Manifold Approximation and Projection) embedding of 123 human players averaged across 637 games. Each point represents one human player. Bottom: Human UMAP colored by reward. (c) Human UMAP colored by fight-flight behavior. (d) Simultaneous affordance-completion curves for a representative pair of fight-flight task-sets from AI gameplay data. (e) Left: UMAP embedding of 116 AI players averaged across 116 games. Each point represents one AI player. Middle: AI UMAP colored by reward. Right: AI UMAP colored by fight-flight behavior.
Figure 5: Explore-Exploit analysis results. (a) Illustrations of goal-directed, navigation-based task-sets. (b) Simultaneous affordance-completion curves for a representative pair of explore-exploit task-sets from human (top) and AI (bottom) gameplay data. (c) Top: An unsupervised 2D UMAP embedding of 123 human players averaged across 637 games. Each point represents one human player. Bottom: Human UMAP colored by reward. (d) Human UMAP colored by explore-exploit behavior. (e) Left: UMAP embedding of 116 AI players averaged across 116 games. Each point represents one AI player. Middle: AI UMAP colored by reward. Right: AI UMAP colored by explore-exploit behavior.
...and 11 more figures

Theorems & Definitions (1)

Definition 1: Task-Set
