FAIRGAME: a Framework for AI Agents Bias Recognition using Game Theory

Alessio Buscemi, Daniele Proverbio, Alessandro Di Stefano, The Anh Han, German Castignani, Pietro Liò

TL;DR

FAIRGAME addresses the challenge of predicting and interpreting AI-agent interactions in multi-agent settings by providing a reproducible, prompt-driven framework that integrates LLM-powered agents with classical game-theoretic models. It uses a JSON-based configuration and multilingual prompt templates to simulate games such as Prisoner's Dilemma and Battle of the Sexes across multiple LLMs and languages, enabling direct comparison with game-theoretic predictions. The results reveal language- and personality-linked biases and deviations from equilibrium behavior, and the authors propose a four-metric scoring system to quantify variability, inconsistency, and payoff sensitivity. FAIRGAME enables systematic bias discovery, guides LLM selection for strategic AI tasks, and supports extensions to incomplete-information, sequential-move, and real-world negotiation and conflict-resolution scenarios.

Abstract

Letting AI agents interact in multi-agent applications adds a layer of complexity to the interpretability and prediction of AI outcomes, with profound implications for their trustworthy adoption in research and society. Game theory offers powerful models to capture and interpret strategic interaction among agents, but requires the support of reproducible, standardized and user-friendly IT frameworks to enable comparison and interpretation of results. To this end, we present FAIRGAME, a Framework for AI Agents Bias Recognition using Game Theory. We describe its implementation and usage, and we employ it to uncover biased outcomes in popular games among AI agents; these outcomes depend on the Large Language Model (LLM) and language used, as well as on the personality traits or strategic knowledge of the agents. Overall, FAIRGAME allows users to reliably and easily simulate their desired games and scenarios and to compare the results across simulation campaigns and with game-theoretic predictions. This enables the systematic discovery of biases and the anticipation of behavior emerging from strategic interplay, and it empowers further research into strategic decision-making using LLM agents.
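To make the phrase "game-theoretic predictions" concrete, the following minimal sketch enumerates the pure-strategy Nash equilibria of a two-player game, which is the kind of baseline simulated outcomes can be compared against. It is not FAIRGAME's code: the payoff values, the dictionary layout, and the function name `pure_nash_equilibria` are illustrative assumptions, not taken from the paper.

```python
from itertools import product

# Hypothetical payoffs as (row player, column player); illustrative values only,
# not the payoff matrices used in the paper.
PRISONERS_DILEMMA = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}

BATTLE_OF_THE_SEXES = {
    ("A", "A"): (2, 1),
    ("A", "B"): (0, 0),
    ("B", "A"): (0, 0),
    ("B", "B"): (1, 2),
}


def pure_nash_equilibria(payoffs):
    """Return all strategy pairs from which neither player gains by deviating alone."""
    rows = sorted({row for row, _ in payoffs})
    cols = sorted({col for _, col in payoffs})
    equilibria = []
    for row, col in product(rows, cols):
        row_payoff, col_payoff = payoffs[(row, col)]
        # The row player cannot improve by switching rows while the column is fixed.
        row_ok = all(payoffs[(alt, col)][0] <= row_payoff for alt in rows)
        # The column player cannot improve by switching columns while the row is fixed.
        col_ok = all(payoffs[(row, alt)][1] <= col_payoff for alt in cols)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria


if __name__ == "__main__":
    print(pure_nash_equilibria(PRISONERS_DILEMMA))
    print(pure_nash_equilibria(BATTLE_OF_THE_SEXES))
```

With these assumed payoffs, mutual defection is the only pure equilibrium of the Prisoner's Dilemma, while the Battle of the Sexes has two coordinated equilibria; agents that behave differently in simulation are deviating from the prediction.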

Paper Structure

This paper contains 12 sections, 6 figures, 3 tables, 3 algorithms.

Figures (6)

  • Figure 1: Schematic representation of FAIRGAME flow of document dependencies and outputs.
  • Figure 2: Prisoner's Dilemma: aggregated final scores of the repeated games over repeated experiments, over all three versions, for each LLM, language, combination of personalities and knowledge of opponent's personality.
  • Figure 3: Average trajectory of strategy choices across repeated rounds in all Prisoner’s Dilemma experiments, presented for each LLM and game variant. A value of 1 denotes selection of Option A, which corresponds to defection in this game, while -1 represents Option B (cooperation).
  • Figure 4: Battle of the Sexes: aggregated final scores of the repeated games and repeated experiments, over all three versions, for each LLM, language, combination of personalities and knowledge of opponent's personality. Cross-language comparison for the conventional configuration.
  • Figure 5: Average evolution of coordination in strategy choices across repeated rounds for all experiments conducted in the Battle of the Sexes, shown separately for each LLM. As this is a coordination game, the plot examines whether the two LLMs select the same option in each round. A value of 1 indicates a mismatch in strategies (one selects Option A, the other Option B), reflecting coordination failure or defective behavior, while -1 indicates alignment in choices, reflecting successful coordination or cooperative behavior.
  • ...and 1 more figure
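The numeric conventions described in the captions of Figures 3 and 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' plotting code: the function names and the list-of-lists input layout are hypothetical, and only the value conventions (1 for Option A or a mismatch, -1 for Option B or a match) come from the captions.

```python
from statistics import mean


def encode_choice(option):
    """Figure 3 convention: Option A maps to 1, Option B maps to -1."""
    if option not in ("A", "B"):
        raise ValueError(f"unexpected option: {option!r}")
    return 1 if option == "A" else -1


def encode_coordination(choice_1, choice_2):
    """Figure 5 convention: 1 for mismatched choices, -1 for matching choices."""
    return 1 if choice_1 != choice_2 else -1


def average_trajectory(experiments):
    """Average encoded values round by round across experiments.

    `experiments` is a list of equal-length lists, one encoded value per round
    (a hypothetical layout chosen for this sketch).
    """
    return [mean(values) for values in zip(*experiments)]


if __name__ == "__main__":
    # Two hypothetical experiments of three rounds each, for a single agent.
    raw = [["A", "A", "B"], ["A", "B", "B"]]
    encoded = [[encode_choice(option) for option in run] for run in raw]
    print(average_trajectory(encoded))
```

An average near 1 in a given round means most runs chose Option A (or failed to coordinate, under the Figure 5 encoding), and an average near -1 means the opposite.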