Distributionally Robust Performative Prediction

Songkai Xue, Yuekai Sun

TL;DR

The paper addresses performative prediction, where a deployed model alters the data distribution, by introducing distributionally robust performative prediction and its solution concept, the distributionally robust performative optimum (DRPO). It defines the distributionally robust performative risk (DRPR) using a KL-divergence-based uncertainty set around the nominal distribution map, and shows a strong duality result that reduces evaluating the DRPR to a single-variable optimization, enabling efficient algorithms. When the true distribution map lies within the uncertainty set, the DRPR upper-bounds the true performative risk, $\operatorname{PR}_{\operatorname{true}}(\theta) \leq \operatorname{DRPR}(\theta)$, and the DRPO has favorable excess-risk properties relative to the performative optimum (PO), especially under misspecification. The authors develop alternating-minimization and tilted-risk algorithms, propose calibration schemes for the uncertainty radius, and show experimentally that the DRPO can outperform the PO on worst-case and fairness-related metrics.
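To make the "single-variable optimization" concrete, here is a minimal sketch of the standard dual form of a KL-constrained worst-case expectation, $\sup_{Q:\,\mathrm{KL}(Q\|P)\le\rho}\mathbb{E}_Q[\ell] = \inf_{\mu>0}\{\mu\rho + \mu\log\mathbb{E}_P[e^{\ell/\mu}]\}$, evaluated on a fixed sample of losses. It is an illustration of the generic KL duality, not the paper's algorithm: the function names `tilted_risk` and `kl_worst_case_risk`, the exponential toy losses, and the search bounds on $\log\mu$ are all assumptions, and the performative dependence of the data on $\theta$ is left out entirely.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import logsumexp


def tilted_risk(losses, alpha):
    """Tilted risk (1/alpha) * log mean(exp(alpha * loss)), computed stably."""
    losses = np.asarray(losses, dtype=float)
    return (logsumexp(alpha * losses) - np.log(losses.size)) / alpha


def kl_worst_case_risk(losses, rho):
    """Worst-case mean loss over a KL ball of radius rho around the sample.

    Minimizes the dual objective mu * rho + mu * log E[exp(loss / mu)]
    over the single variable mu > 0 (parameterized as log mu).
    Returns (worst-case risk, optimal dual variable mu).
    """
    def dual(log_mu):
        mu = np.exp(log_mu)
        return mu * rho + tilted_risk(losses, 1.0 / mu)

    # Hypothetical search range for log(mu); widen it for other loss scales.
    res = minimize_scalar(dual, bounds=(-10.0, 10.0), method="bounded")
    return res.fun, float(np.exp(res.x))


# Toy losses standing in for losses sampled under a nominal distribution.
rng = np.random.default_rng(0)
losses = rng.exponential(size=2000)
print("nominal risk:", losses.mean())
for rho in (0.01, 0.05, 0.2):
    risk, mu = kl_worst_case_risk(losses, rho)
    print(f"rho={rho}: worst-case risk={risk:.4f}, tilt alpha=1/mu={1 / mu:.4f}")
```

The inner term is a tilted risk with tilt $\alpha = 1/\mu$, which is one way to read the correspondence between the radius $\rho$ and the inverse of the optimal dual variable $\mu^\star$ mentioned in the Figure 2 caption.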

Abstract

Performative prediction aims to model scenarios where predictive outcomes subsequently influence the very systems they target. The pursuit of a performative optimum (PO) -- minimizing performative risk -- generally relies on a model of the distribution map, which characterizes how a deployed ML model alters the data distribution. Unfortunately, inevitable misspecification of the distribution map can lead to a poor approximation of the true PO. To address this issue, we introduce a novel framework of distributionally robust performative prediction and study a new solution concept termed the distributionally robust performative optimum (DRPO). We show provable guarantees for the DRPO as a robust approximation to the true PO when the nominal distribution map differs from the actual one. Moreover, distributionally robust performative prediction can be reformulated as an augmented performative prediction problem, enabling efficient optimization. The experimental results demonstrate that the DRPO offers potential advantages over the traditional PO approach when the distribution map is misspecified at either the micro or the macro level.

Paper Structure

This paper contains 35 sections, 7 theorems, 54 equations, 9 figures, 3 algorithms.

Key Result

Proposition 2.6

Suppose that the uncertainty collection $\mathcal{U}$ contains the true distribution map $\mathcal{D}_{\operatorname{true}}$. Then the true performative risk is bounded by the distributionally robust performative risk: $\operatorname{PR}_{\operatorname{true}}(\theta) \leq \operatorname{DRPR}(\theta)$ fo
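
The reasoning behind this bound can be sketched in one line, assuming the DRPR is the worst-case performative risk over the uncertainty collection. Definition 2.4 is listed here only by title, so that form, the loss symbol $\ell$, and the dummy variable $\tilde{\mathcal{D}}$ are assumptions of this sketch rather than the paper's notation:

$$
\operatorname{PR}_{\operatorname{true}}(\theta)
= \mathbb{E}_{Z \sim \mathcal{D}_{\operatorname{true}}(\theta)}\big[\ell(Z; \theta)\big]
\leq \sup_{\tilde{\mathcal{D}} \in \mathcal{U}} \mathbb{E}_{Z \sim \tilde{\mathcal{D}}(\theta)}\big[\ell(Z; \theta)\big]
= \operatorname{DRPR}(\theta).
$$

The inequality holds because $\mathcal{D}_{\operatorname{true}} \in \mathcal{U}$ makes the true risk one of the values over which the supremum is taken.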

Figures (9)

  • Figure 1: Results of Experiment \ref{sec:exp1}. Left: performative risk incurred by the PO and by the DRPOs for various radii $\rho$. Middle: relative improvement in worst-case performative risk of the DRPO over the PO as the radius $\rho$ increases, for different ranges of misspecification $\eta$. Right: radius $\rho$ versus estimated KL divergence between $\mathcal{D}_{\operatorname{true}}(\theta_{\operatorname{DRPO}}(\rho))$ and $\mathcal{D}(\theta_{\operatorname{DRPO}}(\rho))$, where vertical bands indicate the calibrated radii $\rho_{\operatorname{cal}}$.
  • Figure 2: Results of Experiment \ref{sec:exp2}. Left: performative risk incurred by the PO and by the TPOs for various tilts $\alpha$. Middle: relative improvement in worst-case performative risk of the TPO over the PO as the tilt $\alpha$ increases, for different ranges of misspecification $\eta$. Right: the correspondence between the radius $\rho$ and the (inverse of the) optimal dual variable $\mu^\star$.
  • Figure 3: Results of Experiment \ref{sec:exp3}. Performative risk of the population, the majority, and the minority as the tilt $\alpha$ increases. The vertical band indicates the calibrated tilt $\alpha_{\operatorname{cal}}$.
  • Figure 4: Histogram of performative loss under Experiment \ref{sec:exp1} with ${\epsilon}_{\operatorname{true}} = 0.5$. Left: histogram for the PO. Middle: histogram for the DRPO with $\rho = 0.02$. Right: histogram for the DRPO with $\rho = 0.04$.
  • Figure 5: Additional results of Experiment \ref{sec:exp1}. Left: performative balanced error rate incurred by the PO and by the DRPOs for various radii $\rho$. Right: relative improvement in worst-case performative balanced error rate of the DRPO over the PO as the radius $\rho$ increases, for different ranges of misspecification $\eta$.
  • ...and 4 more figures

Theorems & Definitions (22)

  • Example 2.1: Strategic classification
  • Example 2.2: Location family
  • Example 2.3: Disparate impacts and fairness
  • Definition 2.4: Distributionally robust performative risk
  • Definition 2.5: Distributionally robust performative optimum
  • Proposition 2.6: Generalization principle of distributionally robust performative prediction
  • Proposition 3.1: Strong duality of DRPR
  • Proposition 3.2: Excess risk bound of the PO
  • Proposition 3.3: Excess risk bound of the DRPO
  • proof
  • ...and 12 more