Optimal classification with endogenous behavior

Elizabeth Maggie Penn

TL;DR

The paper addresses classification when individuals adjust their behavior in response to the classifier itself (outcome performativity). It develops a Stackelberg-style model with two actions $\beta_i\in\{0,1\}$, private cost $\gamma_i$ drawn from $H$, and signal densities $f_0$ and $f_1$ (with $f_1$ satisfying the strict monotone likelihood ratio property relative to $f_0$). A classifier $\delta(x)$ maps signals to $\Pr[d_i=1]$, and the key quantity $\Delta_\delta=\int (f_1(x)-f_0(x))\delta(x)\,dx$ governs the optimal rule. The main result proves that the optimal classifier is a threshold or a negative-threshold rule, characterized by $\tau_C$ (the crossing point of $f_1$ and $f_0$) and, if necessary, a pair of thresholds $\tau_L<\tau_C<\tau_H$. An explicit example shows that a negative-threshold rule can achieve higher accuracy by inducing more non-compliance, illustrating how a downward shift in the base rate can improve performance. The analysis extends to more general objective functions that weigh true and false positives and negatives, and a corollary shows that threshold or negative-threshold rules remain optimal under both accuracy-aligned and accuracy-misaligned objectives. The paper concludes by highlighting ethical implications: when behavior is endogenous to classification, optimal classification can correlate negatively with signal information, raise fairness and trust concerns, and call for auditing.
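
As a worked illustration of $\Delta_\delta$, consider deterministic cutoff rules. The notation here is assumed for illustration and is not taken from the paper: $F_0$ and $F_1$ denote the CDFs of $f_0$ and $f_1$, a threshold rule is written $\delta_\tau(x)=\mathbf{1}[x\ge\tau]$, and a negative threshold rule is written $\delta^-_\tau(x)=\mathbf{1}[x<\tau]$. The paper's own conventions (for example, how ties at $\tau$ are handled) may differ.

$$\Delta_{\delta_\tau}=\int_\tau^\infty \bigl(f_1(x)-f_0(x)\bigr)\,dx = F_0(\tau)-F_1(\tau), \qquad \frac{d}{d\tau}\Delta_{\delta_\tau}=f_0(\tau)-f_1(\tau).$$

Under strict MLRP the ratio $f_1/f_0$ is strictly increasing, so $f_0>f_1$ below the crossing point $\tau_C$ and $f_0<f_1$ above it. Hence $\Delta_{\delta_\tau}$ rises up to $\tau_C$ and falls afterwards: among threshold rules, $\Delta_\delta$ is largest at $\tau=\tau_C$. For the negative threshold rule the sign flips:

$$\Delta_{\delta^-_\tau}=\int_{-\infty}^{\tau} \bigl(f_1(x)-f_0(x)\bigr)\,dx = F_1(\tau)-F_0(\tau) = -\Delta_{\delta_\tau}\le 0,$$

so under such a rule choosing $\beta_i=1$ lowers, rather than raises, the probability of receiving $d_i=1$.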

Abstract

I consider the problem of classifying individual behavior in a simple setting of outcome performativity where the behavior the algorithm seeks to classify is itself dependent on the algorithm. I show in this context that the most accurate classifier is either a threshold or a negative threshold rule. A threshold rule offers the "good" classification to those individuals more likely to have engaged in a desirable behavior, while a negative threshold rule offers the "good" outcome to those less likely to have engaged in the desirable behavior. While seemingly pathological, I show that a negative threshold rule can maximize classification accuracy when behavior is endogenous. I provide an example of such a classifier and extend the analysis to more general algorithm objectives. A key takeaway is that when behavior is endogenous to classification, optimal classification can negatively correlate with signal information. This may yield negative downstream effects on groups in terms of the aggregate behavior induced by an algorithm.
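
The interaction between a rule and the behavior it induces can be explored numerically. The following is a minimal sketch under stated assumptions, not the paper's model or its example: the Gaussian signal densities, the Gaussian cost distribution `H`, the behavioral rule "choose the desirable action when the private cost is at most $\Delta_\delta$" (reward normalized to 1), and the function name `evaluate` are all hypothetical choices made for illustration.

```python
from math import erf, sqrt


def norm_cdf(x, mean=0.0, sd=1.0):
    """CDF of a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


# Illustrative assumptions (not from the paper):
# signals are N(0, 1) without the desirable behavior and N(1, 1) with it,
# and private costs are N(0, 1).
def F0(x):
    return norm_cdf(x, 0.0, 1.0)


def F1(x):
    return norm_cdf(x, 1.0, 1.0)


def H(c):
    return norm_cdf(c, 0.0, 1.0)


def evaluate(tau, negative=False):
    """Return (delta, compliance rate, accuracy) for a cutoff rule at tau.

    negative=False: "good" classification for signals >= tau.
    negative=True:  "good" classification for signals <  tau.
    """
    # Probability of the "good" classification given each behavior.
    p1 = F1(tau) if negative else 1.0 - F1(tau)
    p0 = F0(tau) if negative else 1.0 - F0(tau)
    delta = p1 - p0  # gain in Pr[good classification] from the desirable behavior
    # Assumed behavioral rule: act desirably iff private cost <= delta.
    compliance = H(delta)
    # Accuracy: true positives among compliers plus true negatives among the rest.
    accuracy = compliance * p1 + (1.0 - compliance) * (1.0 - p0)
    return delta, compliance, accuracy


for tau in (-1.0, 0.0, 0.5, 1.0, 2.0):
    for negative in (False, True):
        d, c, a = evaluate(tau, negative)
        kind = "negative threshold" if negative else "threshold"
        print(f"{kind:>18} tau={tau:+.1f}  delta={d:+.3f}  "
              f"compliance={c:.3f}  accuracy={a:.3f}")
```

The point of the sketch is that the compliance rate is an output of the rule rather than a fixed base rate, so accuracy has to be evaluated at the behavior each rule induces. Which kind of rule comes out ahead depends on the cost distribution and the signal densities; these parameters are not chosen to reproduce the paper's example.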

Paper Structure

This paper contains 2 sections, 2 theorems, 50 equations, 1 table.

Table of Contents

  1. Introduction
  2. Conclusion

Key Result

Theorem 2

Threshold or negative threshold rules are optimally accurate for classification with performativity. Specifically:

Theorems & Definitions (3)

  • Theorem 2
  • Definition 5
  • Corollary 6