Strategic Candidacy in Generative AI Arenas

Chris Hays, Rachel Li, Bailey Flanigan, Manish Raghavan

Abstract

AI arenas, which rank generative models from pairwise preferences of users, are a popular method for measuring the relative performance of models in the course of their organic use. Because rankings are computed from noisy preferences, there is a concern that model producers can exploit this randomness by submitting many models (e.g., multiple variants of essentially the same model) and thereby artificially improve the rank of their top models. This can lead to degradations in the quality, and therefore the usefulness, of the ranking. In this paper, we begin by establishing, both theoretically and in simulations calibrated to data from the platform Arena (formerly LMArena, Chatbot Arena), conditions under which producers can benefit from submitting clones when their goal is to be ranked highly. We then propose a new mechanism for ranking models from pairwise comparisons, called You-Rank-We-Rank (YRWR). It requires that producers submit rankings over their own models and uses these rankings to correct statistical estimates of model quality. We prove that this mechanism is approximately clone-robust, in the sense that a producer cannot improve their rank much by doing anything other than submitting each of their unique models exactly once. Moreover, to the extent that model producers are able to correctly rank their own models, YRWR improves overall ranking accuracy. In further simulations, we show that indeed the mechanism is approximately clone-robust and quantify improvements to ranking accuracy, even under producer misranking.
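The cloning concern described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under simplifying assumptions that are not taken from the paper: every model has the same true quality, the arena's estimate is that quality plus independent Gaussian noise (rather than a score fitted from pairwise votes), and the producer cares only about the rank of their best-scoring copy. The function name and parameters are hypothetical.

```python
import random


def best_rank_with_clones(n_clones, n_rivals=20, noise=1.0, trials=2000, seed=0):
    """Average rank (1 = top) of a producer's best-scoring copy among rivals.

    Assumed toy setup: all models share true quality 0, and each estimated
    score is the true quality plus independent Gaussian noise.
    """
    rng = random.Random(seed)
    total_rank = 0
    for _ in range(trials):
        rivals = [rng.gauss(0.0, noise) for _ in range(n_rivals)]
        clones = [rng.gauss(0.0, noise) for _ in range(n_clones)]
        best = max(clones)  # the producer is judged by their top-scoring copy
        # Rank of that copy relative to the rival models only.
        total_rank += 1 + sum(r > best for r in rivals)
    return total_rank / trials


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(k, round(best_rank_with_clones(k), 2))
```

In this toy model, the average rank of the best copy improves as the number of clones grows, even though no copy is actually better than any rival. The gain comes purely from taking the maximum over more noisy draws.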

Paper Structure

This paper contains 30 sections, 21 theorems, 144 equations, 5 figures, 2 algorithms.

Key Result

Theorem 3.2

For all constants $\varepsilon, \delta > 0$, there exist $s_0, m_0$ such that for all $s \geq s_0$ and $m \geq m_0$, the following holds. For any producer $i$, any strategy profile $z$, and any $(\varepsilon, \delta)$-competitive model $j$, producer $i$ would benefit from submitting an additional copy o

Figures (5)

  • Figure 1: The Status Quo (SQ) mechanism (top half) and the You-Rank-We-Rank (YRWR) mechanism (bottom half).
  • Figure 2: Ranks gained via cloning under the Status Quo versus You-Rank-We-Rank mechanisms, across several of Arena's model arenas.
  • Figure 3: Difference in Kendall tau distance to the ground truth under the Status Quo versus You-Rank-We-Rank mechanisms, across Arena's various arenas. A difference greater than 0 means that the YRWR ranking is closer to the true ranking.
  • Figure 4: Rank difference between submitting one clone and no clones under the SQ mechanism.
  • Figure 5: Rank difference between submitting one clone and no clones under the YRWR mechanism.
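Figure 3 compares rankings by their Kendall tau distance to the ground truth. As a reference for that metric, here is a minimal sketch of the standard unnormalized definition: the number of model pairs that two rankings order differently. Whether the paper normalizes the distance is not stated here, and the function name and example rankings are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Count the pairs of items that the two rankings order differently.

    Both arguments are lists of the same distinct items, best first.
    """
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    if pos_a.keys() != pos_b.keys():
        raise ValueError("rankings must contain the same items")
    discordant = 0
    for x, y in combinations(ranking_a, 2):
        # A pair is discordant when the two rankings disagree on its order.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
            discordant += 1
    return discordant


if __name__ == "__main__":
    truth = ["a", "b", "c", "d"]       # hypothetical ground-truth ranking
    ranking_one = ["b", "a", "d", "c"]  # hypothetical output of one mechanism
    ranking_two = ["a", "b", "d", "c"]  # hypothetical output of another
    d1 = kendall_tau_distance(truth, ranking_one)
    d2 = kendall_tau_distance(truth, ranking_two)
    print(d1 - d2)  # positive: the second ranking is closer to the truth
```

With these example rankings, the first differs from the truth on two pairs and the second on one, so the difference is positive, matching the sign convention in the Figure 3 caption.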

Theorems & Definitions (36)

  • Definition 3.1: $(\varepsilon, \delta)$-competitive model
  • Theorem 3.2: Clone-nonrobustness of the status quo mechanism
  • Example 3.3: Constant possible gain
  • Theorem 4.1: Approximate cloneproofness
  • Proposition 4.2: YRWR is accuracy-improving
  • Corollary 4.3: Efficiency and correctness of YRWR
  • Example 4.4
  • Proposition 4.5: Asymptotic truthfulness
  • Corollary 4.5: Approximate cloneproofness of UA-
  • Proposition 4.6: Efficiency and correctness of UA-YRWR
  • ...and 26 more