Tight Lower Bounds and Improved Convergence in Performative Prediction

Pedram Khorsandi, Rushil Gupta, Mehrnaz Mofakhami, Simon Lacoste-Julien, Gauthier Gidel

TL;DR

This work tackles performative prediction by analyzing how deployed models shift the data distribution and how to achieve rapid convergence to a performatively stable point. It extends Repeated Risk Minimization (RRM) by leveraging historical retraining snapshots, introducing Affine Risk Minimizers (ARM) that form convex combinations of past distributions to widen the solvable problem class. The authors derive a new upper bound for last-iterate methods under weaker conditions, prove tightness results for both the Perdomo 2020 and Mofakhami 2023 frameworks, and establish ARM-based lower bounds, complemented by empirical evidence from credit scoring and ride-sharing benchmarks showing faster convergence with historical data. Collectively, the paper demonstrates that utilizing past distributions can substantially accelerate convergence to stability in dynamic environments, with theoretical guarantees and practical validation.

Abstract

Performative prediction is a framework accounting for the shift in the data distribution induced by the prediction of a model deployed in the real world. Ensuring rapid convergence to a stable solution where the data distribution remains the same after the model deployment is crucial, especially in evolving environments. This paper extends the Repeated Risk Minimization (RRM) framework by utilizing historical datasets from previous retraining snapshots, yielding a class of algorithms that we call Affine Risk Minimizers and enabling convergence to a performatively stable point for a broader class of problems. We introduce a new upper bound for methods that use only the final iteration of the dataset and prove for the first time the tightness of both this new bound and the previous existing bounds within the same regime. We also prove that utilizing historical datasets can surpass the lower bound for last iterate RRM, and empirically observe faster convergence to the stable point on various performative prediction benchmarks. We offer at the same time the first lower bound analysis for RRM within the class of Affine Risk Minimizers, quantifying the potential improvements in convergence speed that could be achieved with other variants in our framework.
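The effect of reusing earlier snapshots can be illustrated on a toy problem. The sketch below is not the paper's algorithm or its experiments: it assumes a hypothetical one-dimensional location family $D(\theta) = \mathcal{N}(\mu + \epsilon\theta, 1)$ with squared loss, so that the exact risk minimizer on any mixture of snapshots is the mixture mean. The function names (`run_windowed_rrm`, `stable_point`) and the parameters `mu`, `eps`, and `tau` are illustrative choices. Setting `tau=1` gives plain RRM; `tau=2` retrains on a uniform average of the last two snapshots.

```python
import numpy as np


def stable_point(eps, mu=1.0):
    """Fixed point of theta = mu + eps * theta (requires eps != 1)."""
    return mu / (1.0 - eps)


def run_windowed_rrm(eps, mu=1.0, tau=1, steps=50, theta0=0.0):
    """Population-level retraining on a toy location family.

    Assumed setting (hypothetical, for illustration only):
      D(theta) = N(mu + eps * theta, 1), loss = 0.5 * (z - theta)^2.
    The minimizer over a uniform mixture of the last `tau` snapshots
    is the mixture mean, mu + eps * mean(theta over the window).
    """
    thetas = [theta0]
    for _ in range(steps):
        window = thetas[-tau:]  # fewer than tau iterates early on
        thetas.append(mu + eps * float(np.mean(window)))
    return np.array(thetas)


if __name__ == "__main__":
    for eps in (-0.9, -1.2):
        target = stable_point(eps)
        for tau in (1, 2):
            final = run_windowed_rrm(eps, tau=tau)[-1]
            print(f"eps={eps}, tau={tau}, |theta - theta_PS|={abs(final - target):.3e}")
```

In this toy setting the error of plain RRM is multiplied by $\epsilon$ at every step, so it oscillates and shrinks slowly for $\epsilon = -0.9$ and diverges for $\epsilon = -1.2$. Averaging two snapshots damps the oscillation in both cases, which mirrors, in a very simplified form, the claim that historical datasets can speed up convergence and enlarge the class of problems for which retraining converges.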

Paper Structure

This paper contains 30 sections, 22 theorems, 192 equations, 7 figures.

Key Result

Theorem 1

Suppose the loss $\ell(f_{\theta}(x), y)$ is $\gamma$-strongly convex with respect to $f_{\theta}(x)$ (strong-convexity assumption) and that the gradient norm with respect to $f_{\theta}(x)$ is bounded by $M = \sup_{x, y, \theta} \|\nabla_{\hat{y}} \ell(f_{\theta}(x), y)\|$ (bounded-gradient assumption) [...] By the Schauder fixed-point theorem, a stable classifier $f_{\theta_{\text{PS}}}$ exists, and if [...]

Figures (7)

  • Figure 1: An example showing that using older snapshots (purple) speeds up convergence to the stable point (orange star) compared to only the latest snapshot (red). The implementation is provided in the code.
  • Figure 2: Convergence of $\|\theta^{t} - \theta_{\text{PS}}\|$ over iterations $t$ for different values of $\tau$, which defines the aggregation of datasets from training snapshots: $D_t = \sum_{i=t-\tau+1}^{t}\frac{1}{\tau}D(\theta_i)$. The dotted line shows our lower bound in the framework of Perdomo et al. (2020), with $\epsilon = 2.49$, $\beta = 1$, and $\gamma = 5.0$. Across all methods, $\|\theta^{t} - \theta_{\text{PS}}\|$ never drops below this bound, which is consistent with our theory.
  • Figure 3: Loss shift due to performativity for the credit-scoring environment. To measure the performative risk accurately, we average over $500$ runs per method. Increasing the aggregation window $\tau$ ($1 \rightarrow 2 \rightarrow 4 \rightarrow t/2 \rightarrow \text{all}$) reduces the loss shifts, so the stable point is reached faster.
  • Figure 4: Log performative risk for the credit-scoring environment across the RRM iterations. The values in the plot are averaged over $500$ runs. Increasing the aggregation window $\tau$ ($1 \rightarrow 2 \rightarrow 4 \rightarrow t/2 \rightarrow \text{all}$) reduces the oscillations in the risk, and all settings converge to the same point. The plot starts at iteration $5$ for readability, because the initial risk values are very high.
  • Figure : Loss shift due to performativity
  • ...and 2 more figures

Theorems & Definitions (22)

  • Theorem 1: RRM convergence (modified Mofakhami framework)
  • Theorem 2: Tight lower bound (modified Mofakhami framework)
  • Theorem 3: Tight lower bound (Perdomo framework)
  • Lemma 1: 2-snapshot ARM recurrence
  • Theorem 4: 2-snapshot ARM convergence
  • Theorem 5: ARM lower bound (modified Mofakhami framework)
  • Theorem 6: ARM lower bound (Perdomo framework)
  • Theorem 7: Proximal ARM lower bound
  • Theorem 8
  • Lemma 2
  • ...and 12 more