Evaluating and Correcting Performative Effects of Decision Support Systems via Causal Domain Shift
Philip Boeken, Onno Zoeter, Joris M. Mooij
TL;DR
This work treats the deployment of decision support predictions as a causal domain shift, introducing the domain indicator D to distinguish pre- and post-deployment in high-stakes settings. It defines deployment and retraining effects, and shows that evaluating these effects and correcting performative bias can be cast as domain adaptation problems solvable via domain pivots {X,Z} and a repeated regression estimator. The approach accommodates selection bias and missing labels, yielding identifiability results and a practical estimation strategy without requiring randomized deployments. By linking evaluation and bias correction under a unified causal framework, it offers a principled method to anticipate, monitor, and mitigate performative effects of DSSs in fields like healthcare and law. The framework thus supports responsible, data-driven deployment of predictive alarms by enabling pre- and post-deployment assessment and bias-aware retraining.
Abstract
When predicting a target variable $Y$ from features $X$, the prediction $\hat{Y}$ can be performative: an agent might act on this prediction, affecting the value of $Y$ that we eventually observe. Performative predictions are deliberately prevalent in algorithmic decision support, where a Decision Support System (DSS) provides a prediction for an agent to affect the value of the target variable. When deploying a DSS in high-stakes settings (e.g. healthcare, law, predictive policing, or child welfare screening), it is imperative to carefully assess the performative effects of the DSS. When the DSS serves as an alarm for a predicted negative outcome, naive retraining of the prediction model is bound to yield a model that underestimates the risk, precisely because the previous model worked effectively. In this work, we propose to model the deployment of a DSS as a causal domain shift and provide novel cross-domain identification results for the conditional expectation $E[Y | X]$, allowing for pre- and post-hoc assessment of the deployment of the DSS, and for retraining of a model that assesses the risk under a baseline policy where the DSS is not deployed. Using a running example, we empirically show that a repeated regression procedure provides a practical framework for estimating these quantities, even when the data is affected by sample selection bias and selective labelling, offering a practical, unified solution for multiple forms of target variable bias.
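The claim that naive retraining underestimates risk after an alarm-type DSS has been deployed can be illustrated with a small simulation. The sketch below is not the paper's running example or its repeated regression estimator. It is a toy setup under assumed values: a single feature $X$ uniform on $[0, 1]$, a baseline risk $P(Y = 1 | X = x) = x$, an alarm that fires whenever $x > 0.5$, and an intervention that halves the risk whenever the alarm fires. The names `simulate`, `ALARM_THRESHOLD` and `INTERVENTION_FACTOR` are hypothetical.

```python
import numpy as np

# Toy assumptions (not taken from the paper):
ALARM_THRESHOLD = 0.5      # alarm fires when baseline risk exceeds this value
INTERVENTION_FACTOR = 0.5  # acting on an alarm halves the risk

rng = np.random.default_rng(0)
n = 100_000


def simulate(deployed: bool):
    """Draw (x, y) pairs with or without the DSS being deployed."""
    x = rng.uniform(0.0, 1.0, size=n)
    baseline_risk = x  # assumed baseline: P(Y = 1 | X = x) = x
    if deployed:
        alarm = x > ALARM_THRESHOLD
    else:
        alarm = np.zeros(n, dtype=bool)
    # Where the alarm fires, the agent intervenes and lowers the risk.
    risk = np.where(alarm, INTERVENTION_FACTOR * baseline_risk, baseline_risk)
    y = rng.binomial(1, risk)
    return x, y


x_pre, y_pre = simulate(deployed=False)   # pre-deployment domain
x_post, y_post = simulate(deployed=True)  # post-deployment domain

# Empirical risk in the high-risk group, estimated separately in each domain.
risk_pre = y_pre[x_pre > ALARM_THRESHOLD].mean()
risk_post = y_post[x_post > ALARM_THRESHOLD].mean()

print(f"baseline risk (pre-deployment data):  {risk_pre:.3f}")
print(f"naive risk    (post-deployment data): {risk_post:.3f}")
```

By construction, the expected risk in the high-risk group is 0.75 before deployment and 0.375 after it, so a model refit only on post-deployment data would report roughly half the baseline risk for exactly the individuals the alarm was designed to flag.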
