Generative Modeling of Individual Behavior at Scale

Nabil Omi, Lucas Caccia, Anurag Sarkar, Jordan T. Ash, Siddhartha Sen

TL;DR

The paper models human behavior at the individual level, at scale. It introduces a multi-task framework that uses parameter-efficient fine-tuning with modular LoRA adapters and a routing mechanism to learn one style vector per player; each vector generates actions in that player's style. The approach supports scalable behavioral stylometry, competitive per-player move generation, and interpolation and steering toward new styles, with results in chess and Rocket League, and it extends beyond gaming to diffusion-based image editing. The authors position the work as a step toward personalized AI partners and interpretable, controllable agents, and as a general method for learning and manipulating representations of individual behavior.

Abstract

There has been a growing interest in using AI to model human behavior, particularly in domains where humans interact with this technology. While most existing work models human behavior at an aggregate level, our goal is to model behavior at the individual level. Recent approaches to behavioral stylometry -- or the task of identifying a person from their actions alone -- have shown promise in domains like chess, but these approaches are either not scalable (e.g., fine-tune a separate model for each person) or not generative, in that they cannot generate actions. We address these limitations by framing behavioral stylometry as a multi-task learning problem -- where each task represents a distinct person -- and use parameter-efficient fine-tuning (PEFT) methods to learn an explicit style vector for each person. Style vectors are generative: they selectively activate shared "skill" parameters to generate actions in the style of each person. They also induce a latent space that we can interpret and manipulate algorithmically. In particular, we develop a general technique for style steering that allows us to steer a player's style vector towards a desired property. We apply our approach to two very different games, at unprecedented scales: chess (47,864 players) and Rocket League (2,000 players). We also show generality beyond gaming by applying our method to image generation, where we learn style vectors for 10,177 celebrities and use these vectors to steer their images.

Paper Structure

This paper contains 34 sections, 4 equations, 13 figures, 1 table, 1 algorithm.

Figures (13)

  • Figure 1: (left) Our overall architecture. We augment a base model with a set of MHR adapters and a routing matrix composed of each player's style vector. (right) Detailed view of an MHR layer, showing a skill inventory of adapters shared across players. The player's style vector specifies which skills are active (in this case, the first and third) to generate the final low-rank weight shift that is applied to the (frozen) base model layer.
  • Figure 2: Accuracy at various game counts of the individual models (Maia) and our method (MHR-Maia). MHR-Maia is within 1% accuracy of individual model fine-tuning using roughly 1% of the compute cost per player.
  • Figure 3: The distribution over cosine similarity between style vectors learned from different partitions of the same player (red) vs across all players (blue). A pair of style vectors learned from non-overlapping portions of a single player's data are far more similar to each other than those learned from distinct players.
  • Figure 4: Comparing different player styles using human-interpretable evaluation metrics.
  • Figure 5: The style of an intermediate player (green) is shown along with the two component players (blue and red).
  • ...and 8 more figures
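
The Figure 1 caption describes a layer in which a player's style vector selects adapters from a shared skill inventory to produce a low-rank weight shift on a frozen base layer, and the Figure 3 and Figure 5 captions compare and blend style vectors. The sketch below illustrates those ideas in NumPy under stated assumptions: it is not the authors' implementation, it uses a single routing head rather than the multi-head variant, and every size, name (`weight_shift`, `forward`, `interpolate`), and the choice to mix the adapter factors before multiplying them are illustrative guesses rather than details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; none of these come from the paper.
d_in, d_out, rank, n_skills = 8, 6, 2, 4

# Frozen base layer weight (never updated per player).
W_base = rng.normal(size=(d_out, d_in))

# Shared skill inventory: one LoRA-style (A, B) pair per skill.
A = rng.normal(size=(n_skills, rank, d_in))   # down-projections
B = rng.normal(size=(n_skills, d_out, rank))  # up-projections


def weight_shift(style):
    """Mix the skill adapters with the style vector, then form the low-rank shift."""
    A_mix = np.einsum("k,kri->ri", style, A)  # (rank, d_in)
    B_mix = np.einsum("k,kor->or", style, B)  # (d_out, rank)
    return B_mix @ A_mix                      # (d_out, d_in), rank <= `rank`


def forward(x, style):
    """Apply the base layer plus the player-specific weight shift."""
    return (W_base + weight_shift(style)) @ x


def cosine_similarity(u, v):
    """Similarity between two style vectors, as compared in Figure 3."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def interpolate(u, v, t=0.5):
    """Linear blend of two style vectors (assumed form of an 'intermediate' player)."""
    return (1.0 - t) * u + t * v


# Hypothetical players: one activates skills 1 and 3 (as in Figure 1), one skills 2 and 4.
style_a = np.array([0.5, 0.0, 0.5, 0.0])
style_b = np.array([0.0, 0.5, 0.0, 0.5])
style_mid = interpolate(style_a, style_b)

x = rng.normal(size=d_in)
y_a = forward(x, style_a)
y_mid = forward(x, style_mid)
```

Only the style vector differs between players in this sketch, which is what makes the per-player cost small: the base weights and the skill inventory are shared. Whether the real routing weights are normalized, discretized, or learned per layer is not stated in the text above, so treat the soft weights here as a placeholder.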