Distributionally Robust Performative Prediction
Songkai Xue, Yuekai Sun
TL;DR
The paper addresses performative prediction, where a deployed model alters the data distribution, by introducing a distributionally robust framework whose solution concept is the distributionally robust performative optimum (DRPO). It defines the distributionally robust performative risk (DRPR) as the worst-case performative risk over a KL-divergence-based uncertainty set around the nominal distribution map, and shows via strong duality that the DRPR reduces to a single-variable optimization, enabling efficient algorithms. When the true distribution map lies within the uncertainty set, PR_true(\theta) is bounded above by DRPR(\theta), and the DRPO has favorable excess-risk properties relative to the performative optimum (PO), especially under misspecification. The authors develop alternating-minimization and tilted-risk algorithms, propose calibration schemes for the uncertainty radius, and demonstrate through experiments that the DRPO can outperform the PO on worst-case and fairness-related metrics.
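The single-variable reduction rests on the standard dual of a KL-constrained worst-case expectation: sup over {Q : KL(Q || P) <= rho} of E_Q[loss] equals inf over mu > 0 of mu * rho + mu * log E_P[exp(loss / mu)]. The sketch below evaluates that dual on a fixed sample of losses, assumed here to be drawn from the nominal distribution at a fixed \theta. The function names, the golden-section search, and the search range for log(mu) are illustrative assumptions, not the authors' algorithm.

```python
import math

def kl_dual_objective(losses, rho, mu):
    """Dual objective: mu * rho + mu * log(mean(exp(loss / mu)))."""
    m = max(losses)
    # Factor out the max loss so exp() cannot overflow when mu is small.
    log_mean_exp = m / mu + math.log(
        sum(math.exp((l - m) / mu) for l in losses) / len(losses)
    )
    return mu * rho + mu * log_mean_exp

def kl_robust_risk(losses, rho, lo=-10.0, hi=10.0, iters=200):
    """Approximate the worst-case mean loss over {Q : KL(Q || P_n) <= rho}.

    Minimizes the dual over t = log(mu) by golden-section search on [lo, hi].
    The dual is convex in mu, hence unimodal in t. The finite range for t is
    an assumption of this sketch and introduces a small truncation error.
    """
    g = lambda t: kl_dual_objective(losses, rho, math.exp(t))
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    gc, gd = g(c), g(d)
    for _ in range(iters):
        if gc < gd:
            # Minimum lies in [a, d]; reuse c as the new right probe.
            b, d, gd = d, c, gc
            c = b - inv_phi * (b - a)
            gc = g(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new left probe.
            a, c, gc = c, d, gd
            d = a + inv_phi * (b - a)
            gd = g(d)
    return g(0.5 * (a + b))

# Hypothetical per-sample losses under the nominal distribution.
sample_losses = [0.0, 1.0, 2.0, 3.0]
for radius in (0.0, 0.1, 0.5):
    print(radius, kl_robust_risk(sample_losses, radius))
```

With radius 0 the value should approach the plain sample mean, and it should grow toward the maximum loss as the radius increases, which is the behavior expected of a worst-case risk.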
Abstract
Performative prediction aims to model scenarios where predictive outcomes subsequently influence the very systems they target. The pursuit of a performative optimum (PO), the minimizer of the performative risk, generally relies on modeling the distribution map, which characterizes how a deployed ML model alters the data distribution. Unfortunately, inevitable misspecification of the distribution map can lead to a poor approximation of the true PO. To address this issue, we introduce a novel framework of distributionally robust performative prediction and study a new solution concept termed the distributionally robust performative optimum (DRPO). We show provable guarantees for the DRPO as a robust approximation to the true PO when the nominal distribution map differs from the actual one. Moreover, distributionally robust performative prediction can be reformulated as an augmented performative prediction problem, enabling efficient optimization. The experimental results demonstrate that the DRPO offers potential advantages over the traditional PO approach when the distribution map is misspecified at either the micro- or macro-level.
