Who's Gaming the System? A Causally-Motivated Approach for Detecting Strategic Adaptation

Trenton Chang, Lindsay Warrenburg, Sae-Hwan Park, Ravi B. Parikh, Maggie Makar, Jenna Wiens

TL;DR

Agents may manipulate inputs to ML-guided payouts, prompting a need to rank those most prone to gaming. The paper introduces a gaming deterrence parameter $\lambda_p$ and shows that while $\lambda_p$ is only partially identifiable, a ranking over agents is recoverable via counterfactual causal effects $\tau(p,p')$. In synthetic experiments, causal-effect estimators outperform noncausal baselines in identifying top offenders; in a Medicare case study, the inferred rankings correlate with the prevalence of for-profit providers, suggesting real-world auditing utility. The approach relies on shared rewards, convex costs, and rational-actor assumptions, offering a principled framework for targeted audits while acknowledging potential limitations and ethical considerations.

Abstract

In many settings, machine learning models may be used to inform decisions that impact individuals or entities who interact with the model. Such entities, or agents, may game model decisions by manipulating their inputs to the model to obtain better outcomes and maximize some utility. We consider a multi-agent setting where the goal is to identify the "worst offenders": agents that are gaming most aggressively. However, identifying such agents is difficult without knowledge of their utility function. Thus, we introduce a framework in which each agent's tendency to game is parameterized via a scalar. We show that this gaming parameter is only partially identifiable. By recasting the problem as a causal effect estimation problem where different agents represent different "treatments," we prove that a ranking of all agents by their gaming parameters is identifiable. We present empirical results in a synthetic data study validating the use of causal effect estimation for gaming detection and show in a case study of diagnosis coding behavior in the U.S. that our approach highlights features associated with gaming.
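
To make the "agents as treatments" idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a single discrete confounder (a hypothetical `stratum` label), binary decisions, and overlap (every agent sees every stratum). Counterfactual decisions are imputed with per-agent stratum means, pairwise effects $\tau(p,p')$ are averaged per agent, and agents are sorted by that score. All function names, the toy data generator, and its parameters are illustrative assumptions.

```python
import random
from collections import defaultdict

def stratum_means(records):
    """Mean decision per (agent, stratum); records are (agent, stratum, decision)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for agent, stratum, decision in records:
        sums[(agent, stratum)] += decision
        counts[(agent, stratum)] += 1
    return {key: sums[key] / counts[key] for key in sums}

def pairwise_effect(records, means, p, q):
    """Estimate tau(p, q) over units seen by p or q.

    Both decisions d_i(p) and d_i(q) are replaced by the stratum mean of the
    corresponding agent, so this is a simple stratified estimator.
    """
    diffs = [
        means[(p, s)] - means[(q, s)]
        for agent, s, _ in records
        if agent in (p, q) and (p, s) in means and (q, s) in means
    ]
    return sum(diffs) / len(diffs)  # assumes overlap, so diffs is non-empty

def rank_agents(records):
    """Sort agents by mean pairwise effect, highest (most suspected gaming) first."""
    agents = sorted({agent for agent, _, _ in records})
    means = stratum_means(records)
    scores = {
        p: sum(pairwise_effect(records, means, p, q) for q in agents if q != p)
        / (len(agents) - 1)
        for p in agents
    }
    return sorted(agents, key=scores.get, reverse=True), scores

def simulate(n_per_agent=5000, seed=0):
    """Toy data (assumed, not from the paper): agent A inflates decisions most,
    but mostly sees the low-base-rate stratum, which confounds raw rates."""
    rng = random.Random(seed)
    base_rate = {0: 0.1, 1: 0.5}
    gaming_offset = {"A": 0.2, "B": 0.1, "C": 0.0}
    share_stratum_0 = {"A": 0.9, "B": 0.5, "C": 0.1}
    records = []
    for agent in gaming_offset:
        for _ in range(n_per_agent):
            s = 0 if rng.random() < share_stratum_0[agent] else 1
            d = 1 if rng.random() < base_rate[s] + gaming_offset[agent] else 0
            records.append((agent, s, d))
    return records

records = simulate()
ranking, scores = rank_agents(records)
print(ranking, scores)
```

In this toy setup the expected raw decision rates are 0.34, 0.40, and 0.46 for agents A, B, and C, so comparing raw rates would point at the wrong agent, while the stratified comparison adjusts for the case mix each agent sees. A real application would likely need a richer outcome model than stratum means.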

Paper Structure

This paper contains 55 sections, 9 theorems, 32 equations, 31 figures, 3 tables, 2 algorithms.

Key Result

Proposition 1

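Define $R'(\cdot)$ as $\frac{dR}{dd^*_p}$ and $c'(\cdot)$ as $\frac{dc}{dd^*_p}$. For any agent $p$, given Assumptions assump:shared through assump:diminishing and an observed $\Delta_p(d^*_p)$, the gaming parameter $\lambda_p$ admits a bound (the displayed inequality is missing from this extract), and the bound is sharp.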

Figures (31)

  • Figure 1: Left: Two agents with gaming deterrence parameters $\lambda_1=30$ (purple) and $\lambda_2 = 50$ (blue) maximize utility (reward $R$ - cost $c$) with respect to diagnosis rate. Gaming costs increase in $\lambda_{(\cdot)}$, and lower an agent's optimal diagnosis rate (stars). Center: Agents' observed decisions reflect utility-maximizing behavior. Right: A decision-maker computes a payout based on agent decisions.
  • Figure 2: Left: Toy dataset with observed factual outcomes $d_i(p)$ and $d_i(p')$. "?" denotes missing counterfactual outcomes. Right: Causal graph for gaming detection with confounders $\mathbf{x}$, agent indicator $p$, ground truth diagnosis $d^*$, and agent decision $d$.
  • Figure 3: Causally-motivated gaming detection. Left: First, we impute counterfactual decisions for each agent. Middle: The imputed counterfactuals yield average treatment effects (ATEs) across pairs of agents. Right: Using ATE estimates to rank agents yields a ranking of the gaming parameter $\lambda_p$. We show one direction of comparison across agents for simplicity. In practice, we impute decisions for both directions of comparison and average (given blue agent's observations, impute purple agent's decisions).
  • Figure 4: a) Causally-motivated gaming detection.
  • Figure 5: Mean top-5 sensitivity (left) and DCG (center) across # of agents audited, and top-5 sensitivity with 7 audits (right) at mean range 0.9, with $\pm\sigma$ error. Causal methods improve over non-causal baselines. ${\triangledown}$: naïve baselines. ${\circ}$: anomaly detectors. ${\times}$: causal effect estimators.
  • ...and 26 more figures
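
The Figure 5 caption reports top-5 sensitivity and DCG as a function of the number of agents audited. The exact definitions are not given in this extract, so the sketch below uses common forms as an assumption: sensitivity is the fraction of the true top-$k$ agents found among the first `n_audits` predicted agents, and DCG sums a per-agent relevance with a $\log_2$ position discount. The function names and the `relevance_by_agent` mapping are hypothetical.

```python
import math

def top_k_sensitivity(true_ranking, predicted_ranking, k, n_audits):
    """Fraction of the true top-k agents that appear in the first n_audits predictions."""
    true_top = set(true_ranking[:k])
    audited = set(predicted_ranking[:n_audits])
    return len(true_top & audited) / k

def dcg(relevance_by_agent, predicted_ranking, n_audits):
    """Discounted cumulative gain of the first n_audits predictions.

    Position 0 has discount log2(2) = 1, position 1 has log2(3), and so on.
    """
    return sum(
        relevance_by_agent[agent] / math.log2(position + 2)
        for position, agent in enumerate(predicted_ranking[:n_audits])
    )

# Hypothetical example: six agents, true order a..f, and an imperfect prediction.
true_ranking = ["a", "b", "c", "d", "e", "f"]
predicted_ranking = ["b", "a", "f", "c", "d", "e"]
relevance_by_agent = {"a": 3, "b": 2, "c": 1, "d": 0, "e": 0, "f": 0}

sensitivity = top_k_sensitivity(true_ranking, predicted_ranking, k=3, n_audits=3)
gain = dcg(relevance_by_agent, predicted_ranking, n_audits=2)
print(sensitivity, gain)
```

Increasing `n_audits` can only keep sensitivity the same or raise it, which matches the caption's framing of sensitivity across audit budgets.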

Theorems & Definitions (17)

  • Remark 1: Gaming is utility-maximizing
  • Proposition 1
  • Theorem 1
  • Corollary 1
  • Proposition 2
  • Proposition
  • proof
  • Theorem
  • proof
  • Definition 1: $\varepsilon$-gaming
  • ...and 7 more