Microfoundation Inference for Strategic Prediction
Daniele Bracale, Subha Maity, Felipe Maia Polo, Seamus Somerstep, Moulinath Banerjee, Yuekai Sun
TL;DR
This work tackles performative prediction by learning the microfoundations of agent responses to predictive models. It models agent behavior as a cost-adjusted utility maximization with a known benefit $B_\theta$ and an unknown cost $c$ restricted to a Bregman divergence $c_\varphi(z,z') = \varphi(z') - \varphi(z) - \nabla\varphi(z)^\top(z'-z)$, and aims to identify $\varphi$ from pre- and post-model distributions. The authors propose an optimal-transport-based estimator that aligns ex-ante and ex-post distributions to recover the gradient $\nabla\varphi$, enabling accurate reconstruction of the response map $T_\theta$ and robust minimization of performative risk. They establish identifiability conditions and convergence rates (e.g., $\mathbb{E}[\|\widehat{\gamma}-\gamma^*\|_2^2] \le K n^{-2/d}$ under strong convexity). They validate the approach on a credit-scoring dataset, showing robustness to misspecification of the benefit function and competitive performance relative to baselines. Overall, the framework provides a principled, data-driven way to infer the social impacts of predictions and to optimize predictive models in strategic settings.
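To make the cost-adjusted utility maximization concrete, here is a minimal sketch rather than the authors' implementation. It assumes a quadratic potential $\varphi(z) = \tfrac{1}{2} z^\top A z$ and a linear benefit $B_\theta(z') = \theta^\top z'$; both choices, along with the names `A`, `bregman_cost`, `best_response`, and `utility`, are illustrative assumptions not taken from the paper. Under these assumptions the Bregman cost reduces to $\tfrac{1}{2}(z'-z)^\top A (z'-z)$, and the first-order condition $\theta = A(z'-z)$ gives the response in closed form.

```python
import numpy as np

def bregman_cost(phi, grad_phi, z, z_new):
    """c_phi(z, z') = phi(z') - phi(z) - <grad phi(z), z' - z>."""
    return phi(z_new) - phi(z) - grad_phi(z) @ (z_new - z)

# Assumption for illustration: phi(z) = 0.5 * z^T A z, A symmetric positive definite.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
phi = lambda z: 0.5 * z @ A @ z
grad_phi = lambda z: A @ z

def best_response(z, theta):
    """Maximizer of theta^T z' - c_phi(z, z') for the quadratic phi above.

    The first-order condition is theta = A (z' - z), so z' = z + A^{-1} theta.
    """
    return z + np.linalg.solve(A, theta)

def utility(z, z_new, theta):
    """Cost-adjusted utility with the assumed linear benefit theta^T z'."""
    return theta @ z_new - bregman_cost(phi, grad_phi, z, z_new)

z = np.array([1.0, -1.0])       # ex-ante features of one agent
theta = np.array([0.5, 1.0])    # hypothetical model parameters
z_star = best_response(z, theta)
```

With a non-quadratic $\varphi$ the response generally has no closed form, and one would solve the first-order condition numerically instead.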
Abstract
Often in prediction tasks, the predictive model itself can influence the distribution of the target variable, a phenomenon termed performative prediction. Generally, this influence stems from strategic actions taken by stakeholders with a vested interest in predictive models. A key challenge that hinders the widespread adoption of performative prediction in machine learning is that practitioners are generally unaware of the social impacts of their predictions. To address this gap, we propose a methodology for learning the distribution map that encapsulates the long-term impacts of predictive models on the population. Specifically, we model agents' responses as a cost-adjusted utility maximization problem and propose estimates for said cost. Our approach leverages optimal transport to align pre-model exposure (ex ante) and post-model exposure (ex post) distributions. We provide a rate of convergence for this proposed estimate and assess its quality through empirical demonstrations on a credit-scoring dataset.
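The idea of aligning ex ante and ex post samples can be illustrated with a one-dimensional toy example. This is a sketch under strong simplifying assumptions, not the estimator proposed in the paper: the cost is taken to be quadratic with an unknown curvature `a_true`, the benefit is linear with a known slope `theta`, and the two samples are drawn independently so that no individual-level pairing is available. In one dimension, matching order statistics yields the optimal transport coupling for any cost that is a convex function of the displacement, so sorting both samples recovers the response map without paired data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth: cost (a/2) * (z' - z)^2 and benefit theta * z',
# so every agent responds with z' = z + theta / a.
a_true, theta, n = 2.0, 1.0, 5000

# Independent ex ante and ex post samples (unpaired).
z_ante = rng.normal(size=n)
z_post = rng.normal(size=n) + theta / a_true

# 1D optimal transport: pair the i-th smallest ex ante point
# with the i-th smallest ex post point.
z_ante_sorted = np.sort(z_ante)
z_post_sorted = np.sort(z_post)

# The average displacement along the coupling estimates theta / a.
shift_hat = np.mean(z_post_sorted - z_ante_sorted)
a_hat = theta / shift_hat

print(f"estimated curvature: {a_hat:.3f} (true value {a_true})")
```

In higher dimensions sorting is no longer available, and the coupling would have to come from a general optimal transport solver.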
