Performative Prediction: Past and Future

Moritz Hardt; Celestine Mendler-Dünner

TL;DR

The paper formalizes performativity in machine learning by introducing a distribution map $\mathcal{D}(\theta)$ that captures how deployed predictions alter the data-generating distribution. It distinguishes learning from steering through two solution concepts: performative stability and performative optimality. It reviews retraining dynamics (repeated risk minimization, RRM; repeated gradient descent, RGD; and stochastic gradient descent, SGD) and their convergence under regularity conditions, and it addresses practical deployment strategies, stochastic updates, and dynamic benchmarks. It also covers model-based approaches, including predictions as mediators, strategic classification, and parametric distribution maps, together with algorithmic collective action and a formal notion of performative power that connects to antitrust and competition considerations. The framework supports evaluating and optimizing the downstream social and economic impact of predictive systems, with implications for fairness, policy, and digital-market investigations, and it highlights data feedback loops as a key practical challenge and opportunity.

Abstract

Predictions in the social world generally influence the target of prediction, a phenomenon known as performativity. Self-fulfilling and self-negating predictions are examples of performativity. Of fundamental importance to economics, finance, and the social sciences, the notion has been absent from the development of machine learning that builds on the static perspective of pattern recognition. In machine learning applications, however, performativity often surfaces as distribution shift. A predictive model deployed on a digital platform, for example, influences behavior and thereby changes the data-generating distribution. We discuss the recently founded area of performative prediction that provides a definition and conceptual framework to study performativity in machine learning. A key element of performative prediction is a natural equilibrium notion that gives rise to new optimization challenges. What emerges is a distinction between learning and steering, two mechanisms at play in performative prediction. Steering is in turn intimately related to questions of power in digital markets. The notion of performative power that we review gives an answer to the question of how much a platform can steer participants through its predictions. We end with a discussion of future directions, such as the role that performativity plays in contesting algorithmic systems.
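
For orientation, the two solution concepts named in the TL;DR can be written in the notation commonly used in the performative prediction literature. The symbols $\mathrm{PR}$, $\theta_{\mathrm{PO}}$, and $\theta_{\mathrm{PS}}$ do not appear in this summary and are assumptions of this sketch; $\mathcal{D}(\theta)$ and $\ell(\theta, z)$ are as used elsewhere on this page.

```latex
% Performative risk: loss of theta on the distribution that theta itself induces.
\mathrm{PR}(\theta) \;=\; \mathbb{E}_{z \sim \mathcal{D}(\theta)}\, \ell(\theta, z)

% Performative optimality: minimize the performative risk directly.
\theta_{\mathrm{PO}} \;\in\; \arg\min_{\theta}\; \mathrm{PR}(\theta)

% Performative stability: a fixed point of retraining on the induced distribution.
\theta_{\mathrm{PS}} \;\in\; \arg\min_{\theta}\; \mathbb{E}_{z \sim \mathcal{D}(\theta_{\mathrm{PS}})}\, \ell(\theta, z)

% Sensitivity (cf. Definition 1): D is epsilon-sensitive if, in Wasserstein-1 distance,
W_1\bigl(\mathcal{D}(\theta), \mathcal{D}(\theta')\bigr) \;\le\; \epsilon\, \lVert \theta - \theta' \rVert_2
\quad \text{for all } \theta, \theta'.
```

In the stability condition the distribution is held fixed at $\mathcal{D}(\theta_{\mathrm{PS}})$ while $\theta$ is optimized, whereas the optimality condition accounts for the shift that each candidate $\theta$ would itself cause. This is one way to read the distinction between learning and steering.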

Paper Structure

This paper contains 33 sections, 4 theorems, 41 equations, and 4 figures.

Introduction
Contributions and outline
Motivation: the GMS theorem
Performative prediction
Distribution map
Performative stability
Performative optimality
Rewriting the rules of prediction
Revisiting economic forecasting
Retraining under performativity
Repeated risk minimization
Gradient-based optimization
Stochastic Optimization
Practical considerations
Data feedback loops and dynamic benchmarks
...and 18 more sections

Key Result

Theorem 2

Suppose that the loss function $\ell(\theta,z)$ is $\gamma$-strongly convex in $\theta$ and $\beta$-smooth in $z$. Then repeated retraining, as defined by the repeated risk minimization (RRM) update, converges to a unique stable point as long as the sensitivity $\epsilon$ of the distribution map $\mathcal{D}(\cdot)$ satisfies $\epsilon < \ldots$
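
A toy simulation can make repeated retraining concrete. The sketch below is not taken from the paper: it assumes a Gaussian location family $\mathcal{D}(\theta) = \mathcal{N}(\mu + \epsilon\theta, 1)$ and the squared loss $\ell(\theta, z) = \tfrac{1}{2}(z - \theta)^2$, and the function names and the values of `mu`, `eps`, and `n` are hypothetical. In this model each retraining step returns the sample mean of data drawn from $\mathcal{D}(\theta_t)$, so the iterates follow $\theta_{t+1} \approx \mu + \epsilon\theta_t$ and, for $|\epsilon| < 1$, approach the fixed point $\mu / (1 - \epsilon)$.

```python
import random
import statistics


def sample_from_distribution_map(theta, n, mu, eps, rng):
    """Draw n samples from the assumed map D(theta) = Normal(mu + eps * theta, 1)."""
    return [rng.gauss(mu + eps * theta, 1.0) for _ in range(n)]


def risk_minimizer(samples):
    """Minimizer of the empirical risk for the loss 0.5 * (z - theta)^2.

    For squared loss this is simply the sample mean.
    """
    return statistics.fmean(samples)


def repeated_risk_minimization(theta0, steps, n=20000, mu=1.0, eps=0.5, seed=0):
    """Deploy theta, observe data from D(theta), retrain, and repeat."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    theta = theta0
    trajectory = [theta]
    for _ in range(steps):
        samples = sample_from_distribution_map(theta, n, mu, eps, rng)
        theta = risk_minimizer(samples)
        trajectory.append(theta)
    return trajectory


mu, eps = 1.0, 0.5
trajectory = repeated_risk_minimization(theta0=0.0, steps=30, mu=mu, eps=eps)
stable_point = mu / (1.0 - eps)  # solves theta = mu + eps * theta

print(f"last iterate: {trajectory[-1]:.3f}, stable point: {stable_point:.3f}")
```

Setting `eps` to a value of magnitude at least 1 in this toy model makes the recursion $\theta_{t+1} = \mu + \epsilon\theta_t$ stop contracting, which illustrates why a bound on the sensitivity of the distribution map is needed for convergence.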

Figures (4)

Figure 1: Simon's argument for the existence of stable points.
Figure 2: Different deployment strategies in stochastic optimization for performative prediction. Left: Greedy deploy publishes the model after every stochastic update. Right: Lazy deploy processes multiple samples offline before releasing the updated model.
Figure 3: Confidence bounds on the performative risk. Left: using bandit feedback and Lipschitzness of the performative risk. Right: using performative feedback together with sensitivity of the distribution map and Lipschitzness of $\ell$ in $z$.

Theorems & Definitions (9)

Definition 1: Sensitivity
Theorem 2 [PZMH20]
Proof of Theorem 2
Theorem 3 [mendler20stochasticPP]
Proof sketch of Theorem 3
Definition 4: Performative Power
Definition 5: Causal effect of position
Theorem 6 [hardt2022power]
Theorem 7 [hardt2023algorithmic]
