Evaluating Agents using Social Choice Theory

Marc Lanctot, Kate Larson, Yoram Bachrach, Luke Marris, Zun Li, Avishkar Bhoopchand, Thomas Anthony, Brian Tanner, Anna Koop

TL;DR

This work reframes the evaluation of general AI agents as a social choice problem: each task is treated as a voter, and a social welfare function aggregates the votes into a global ranking without cross-task score normalization. It introduces Voting-as-Evaluation (VasE) and emphasizes maximal lotteries as a principled, tractable method with strong axiomatic guarantees, alongside Iterative Maximal Lotteries (IML) for producing a full ranking and discovering cycles. In empirical studies across reinforcement learning, large language models, and human Diplomacy data, VasE is more robust and more interpretable than Elo and Nash averaging, and it generalizes favorably. The results highlight the framework’s flexibility: it supports task-weighted evaluations and reveals latent structure such as game-theoretic cycles. Promising directions include uncertainty handling and extensions to agent-vs-agent (AvA) evaluation.

Abstract

We argue that many general evaluation problems can be viewed through the lens of voting theory. Each task is interpreted as a separate voter, which requires only ordinal rankings or pairwise comparisons of agents to produce an overall evaluation. By viewing the aggregator as a social welfare function, we are able to leverage centuries of research in social choice theory to derive principled evaluation frameworks with axiomatic foundations. These evaluations are interpretable and flexible, while avoiding many of the problems currently facing cross-task evaluation. We apply this Voting-as-Evaluation (VasE) framework across multiple settings, including reinforcement learning, large language models, and humans. In practice, we observe that VasE can be more robust than popular evaluation frameworks (Elo and Nash averaging), discovers properties in the evaluation data not evident from scores alone, and can predict outcomes better than Elo in a complex seven-player game. We identify one particular approach, maximal lotteries, that satisfies important consistency properties relevant to evaluation, is computationally efficient (polynomial in the size of the evaluation data), and identifies game-theoretic cycles.
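To make the "each task is a voter" idea concrete, the sketch below turns per-task scores into ordinal rankings, one vote per task. It is an illustration only: the task names, agent names, and score values are hypothetical, and `task_to_ranking` is a helper introduced here, not a function from the paper.

```python
# Hypothetical per-task scores on incomparable scales (higher is better).
# None of these numbers come from the paper.
scores = {
    "task_reward": {"A": 1200.0, "B": 950.0, "C": 400.0},
    "task_accuracy": {"A": 0.61, "B": 0.74, "C": 0.58},
    "task_win_rate": {"A": 0.55, "B": 0.40, "C": 0.52},
}


def task_to_ranking(task_scores):
    """Return the agents ordered best to worst for one task (one vote).

    Only the order of the scores matters, so no cross-task normalization
    is needed. Ties are broken by insertion order in this sketch.
    """
    return sorted(task_scores, key=lambda agent: task_scores[agent], reverse=True)


# One ordinal ranking (vote) per task.
votes = {task: task_to_ranking(s) for task, s in scores.items()}

for task, ranking in votes.items():
    print(task, ranking)
```

Because each vote keeps only the order of agents within a task, a reward in the thousands and an accuracy below one contribute on equal footing.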

Paper Structure (44 sections, 3 theorems, 29 equations, 17 figures, 23 tables, 2 algorithms)

Key Result

Lemma 1

In a two-player, symmetric, zero-sum game, the maximum entropy Nash equilibrium (MENE) results in unique and equal mixed strategies for both players.

Figures (17)

  • Figure 1: An evaluation problem for general agents. Each event has its own separate metric for determining the ranking of each participant, with gold, silver, and bronze corresponding to first, second, and third place ranks.
  • Figure 2: Example effect of clones on Elo predictions. (Top) True win frequencies of a transitive relationship: $A$ beats $B$ beats $C$. (Bottom) Worst-case error of true $p_{AC}$ from Elo's predictions $\hat{p}_{AC}$ with clones of $B$.
  • Figure 3: Head-to-head win rates for agents in the pentathlon from Figure 1. Given these win rates, Elo assigns A and C the same rating.
  • Figure 4: (Left) The voter preference matrix $N(x, y)$ shows the number of events (votes) in which the agent on row $x$ is preferred to the agent on column $y$, for the example in Figure 1. (Right) The voter margin matrix whose entries are $M(x,y) = \delta(x,y) = N(x,y) - N(y,x)$.
  • Figure 5: Average ranking error across 50 splits. Error bars represent 95% confidence intervals.
  • ...and 12 more figures
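
The matrices described in the Figure 4 caption can be computed directly from a set of rankings. The following is a minimal sketch, assuming each vote is a complete ranking listed best to worst; the three votes are hypothetical and are not the pentathlon data from the paper, and the function names are introduced here for illustration.

```python
def preference_matrix(agents, votes):
    """N[x][y] = number of votes in which agent x is ranked above agent y."""
    index = {agent: i for i, agent in enumerate(agents)}
    n = len(agents)
    N = [[0] * n for _ in range(n)]
    for ranking in votes:
        for pos, better in enumerate(ranking):
            # Every agent listed after `better` is ranked below it.
            for worse in ranking[pos + 1:]:
                N[index[better]][index[worse]] += 1
    return N


def margin_matrix(N):
    """M[x][y] = N[x][y] - N[y][x], as in the Figure 4 caption."""
    n = len(N)
    return [[N[x][y] - N[y][x] for y in range(n)] for x in range(n)]


# Hypothetical votes: each list ranks the agents from best to worst.
agents = ["A", "B", "C"]
votes = [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]

N = preference_matrix(agents, votes)
M = margin_matrix(N)
print(N)  # [[0, 2, 3], [1, 0, 2], [0, 1, 0]]
print(M)  # [[0, 1, 3], [-1, 0, 1], [-3, -1, 0]]
```

By construction $M$ is skew-symmetric, $M(x,y) = -M(y,x)$, so it can be read as the payoff matrix of a symmetric zero-sum game between two rankers.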

Theorems & Definitions (10)

  • Lemma 1: Symmetric MENE
  • proof
  • Lemma 2: Entropy of Product of Marginals
  • Theorem 1: Nash Average AvT Equivalence
  • proof
  • Definition 1: Condorcet Winner (de Condorcet, 1785)
  • Definition 2: Condorcet Consistency
  • Definition 3: Population Consistency (Smith, 1973)
  • Definition 4: Clone Consistency (Tideman, 1987)
  • Definition 5: Agenda Consistency
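
The definitions listed above are not reproduced on this page, so the sketch below relies on the standard textbook forms, stated here as assumptions: a Condorcet winner is an agent with a strictly positive margin against every other agent, and a lottery $p$ over agents is maximal when $\sum_x p(x) M(x,y) \ge 0$ for every agent $y$. The code only checks a candidate lottery; it does not solve for one (that would normally be done with a linear program). Function names and both margin matrices are hypothetical.

```python
def condorcet_winner(agents, M):
    """Return the agent that beats every other agent head-to-head, or None."""
    n = len(agents)
    for x, agent in enumerate(agents):
        if all(M[x][y] > 0 for y in range(n) if y != x):
            return agent
    return None


def is_maximal_lottery(p, M, tol=1e-9):
    """Check the assumed condition: p is a distribution and p^T M >= 0."""
    n = len(M)
    if abs(sum(p) - 1.0) > tol or any(pi < -tol for pi in p):
        return False  # not a valid probability distribution
    # Expected margin of the lottery against each pure alternative y.
    return all(sum(p[x] * M[x][y] for x in range(n)) >= -tol for y in range(n))


agents = ["A", "B", "C"]

# Hypothetical transitive margins: A beats B and C, B beats C.
M_transitive = [[0, 1, 3], [-1, 0, 1], [-3, -1, 0]]
print(condorcet_winner(agents, M_transitive))             # A
print(is_maximal_lottery([1.0, 0.0, 0.0], M_transitive))  # True

# Hypothetical cycle: A beats B, B beats C, C beats A, all by margin 1.
M_cycle = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]
print(condorcet_winner(agents, M_cycle))                  # None
print(is_maximal_lottery([1 / 3, 1 / 3, 1 / 3], M_cycle)) # True
print(is_maximal_lottery([1.0, 0.0, 0.0], M_cycle))       # False
```

In the transitive example the lottery puts all its mass on the Condorcet winner, which is what Condorcet consistency asks for. In the cycle no single agent qualifies, and the mass is spread over the agents in the cycle.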