On the Impact of Performative Risk Minimization for Binary Random Variables

Nikita Tsoy, Ivan Kirev, Negin Rahimiyazdi, Nikola Konstantinov

TL;DR

This work analyzes the broader impact of performative risk minimization (PRM) on binary outcomes under linear distribution shifts. It develops a sequential PRM framework, introduces two impact metrics (bias and mean shift), and derives explicit formulas for the PRM path under full information in one-period and infinite-horizon settings, distinguishing slow and rapid deployment regimes. It also provides partial-information estimators and RL simulations to validate the theory, showing that PRM can induce nonzero bias and shift the mean toward extreme values, potentially amplifying distributional changes relative to standard risk minimization (RM). The findings show how optimizing for performative accuracy can inadvertently reshape the data-generating process, with implications for applications such as drug efficacy and traffic prediction, and they lay groundwork for extending these insights to more complex distributions and multi-group scenarios.

Abstract

Performativity, the phenomenon where outcomes are influenced by predictions, is particularly prevalent in social contexts where individuals strategically respond to a deployed model. In order to preserve the high accuracy of machine learning models under distribution shifts caused by performativity, Perdomo et al. (2020) introduced the concept of performative risk minimization (PRM). While this framework ensures model accuracy, it overlooks the impact of PRM on the underlying distributions and the predictions of the model. In this paper, we initiate the analysis of the impact of PRM by studying performativity for a sequential performative risk minimization problem with binary random variables and linear performative shifts. We formulate two natural measures of impact. In the case of full information, where the distribution dynamics are known, we derive explicit formulas for the PRM solution and our impact measures. In the case of partial information, we provide performative-aware statistical estimators, as well as simulations. Our analysis contrasts PRM with alternatives that do not model data shift and indicates that PRM can have amplified side effects compared to such methods.
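To make the contrast between PRM and a shift-unaware alternative concrete, here is a minimal one-step sketch. It is a toy illustration, not the paper's model: the shift rule `shifted_mean` (the mean moves from `p0` toward the prediction by a factor `alpha`), the squared loss, and all function names are assumptions introduced for this example. The only fact used beyond those assumptions is that a Bernoulli outcome with mean $p$ has expected squared error $p(1-p) + (p-\theta)^2$ under prediction $\theta$.

```python
def shifted_mean(theta, p0, alpha):
    """Hypothetical linear shift: the mean moves from p0 toward theta by a factor alpha."""
    return min(1.0, max(0.0, p0 + alpha * (theta - p0)))


def performative_risk(theta, p0, alpha):
    """Expected squared error of theta on the distribution that theta itself induces."""
    p = shifted_mean(theta, p0, alpha)
    # Bernoulli(p): E[(y - theta)^2] = Var(y) + (E[y] - theta)^2
    return p * (1.0 - p) + (p - theta) ** 2


def prm_prediction(p0, alpha, grid_size=10001):
    """Grid search over [0, 1] for the prediction with the lowest performative risk."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return min(grid, key=lambda theta: performative_risk(theta, p0, alpha))


p0, alpha = 0.7, 0.3            # illustrative values, not taken from the paper
theta_rm = p0                   # shift-unaware choice: predict the current mean
theta_prm = prm_prediction(p0, alpha)

for name, theta in [("RM", theta_rm), ("PRM", theta_prm)]:
    p1 = shifted_mean(theta, p0, alpha)
    print(f"{name}: theta={theta:.3f}  induced mean={p1:.3f}  "
          f"gap={theta - p1:.3f}  risk={performative_risk(theta, p0, alpha):.4f}")
```

Under these assumptions the shift-unaware choice leaves the mean at 0.7, while the PRM choice is about 0.85 and moves the mean to about 0.745. It lowers the risk by pushing the outcome toward a less uncertain, more extreme value and by accepting a nonzero gap between prediction and induced mean. This matches the qualitative effect the abstract describes, though the paper's own definitions of bias and mean shift are not reproduced here.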

Paper Structure

This paper contains 89 sections, 9 theorems, 97 equations, 6 figures, 2 algorithms.

Key Result

Lemma 3.2

The mean squared error of $\theta_t$ on $D^{\text{test}}_t$ is

Figures (6)

  • Figure 1: The dependence of $\theta^*_0$ (blue), $p^*_1$ (orange), and $s_1$ (green) on $p_0$ for $\lambda = 0.8$ and $\pi = 0.2$ in the slow $T=1$ case. Columns correspond to different values of $\alpha$.
  • Figure 2: The dependence of $\mathtt{bias}(\hat{\theta}^*_0)$ (left) and $\mathtt{shift}(\hat{\theta}^*_0)$ (right), and of the corresponding variances, on $p_0$. The upper row corresponds to $\alpha=0.3$, the lower row to $\alpha=-0.4$. Columns correspond to different values of $m$.
  • Figure 3: The dependence of the difference in expected losses, $\mathbb{E}(\mathtt{loss}(\hat{\theta}^*_0) - \mathtt{loss}(\hat{\theta}^n_0))$, on $p_0$ and $\alpha$, for different values of $m$.
  • Figure 4: The predictions $\theta_t$ (blue), the means $p_t$ (orange), and their theoretical equilibrium values (red and green, respectively) in the RL setting over episodes (left) or time (right) for $\pi = 0.2$, $\alpha = 0.15$, $\gamma = 0.9$, and $m=100$, where $m$ is the number of samples observed from the test distribution at each step. The left and right plots correspond to the $T=1$ slow episodic setting (with $\lambda = 0$) and the $T=\infty$ slow setting (with $\lambda = 0.3$), respectively.
  • Figure 5: The dependence of $\theta^*_0$ (blue), $p^*_1$ (orange), $s_1$ (green), and $p^*_\infty$ (red) on $p_0$ for $\lambda = 0.8$, $\pi = 0.2$, and $\gamma=0.5$ in the $T=\infty$ case. Columns correspond to different values of $\alpha$; the top and bottom rows correspond to the slow and rapid cases, respectively.
  • ...and 1 more figure

Theorems & Definitions (13)

  • Remark 3.1
  • Lemma 3.2: Error-Uncertainty Tradeoff
  • Proposition 4.1: Proof in \ref{sec:proof-one-slow-sol}
  • Proposition 4.2: Proof in \ref{sec:proof-two-rapid-sol}
  • Theorem 4.3
  • Theorem 5.1: Proof in \ref{sec:proof-inf-slow-sol}
  • Theorem 5.2: Proof in \ref{sec:proof-inf-rapid-sol}
  • Proposition 1.1: Proof in \ref{sec:proof-two-slow-sol}
  • Theorem 1.2
  • proof
  • ...and 3 more