Evaluating Prediction-based Interventions with Human Decision Makers In Mind
Inioluwa Deborah Raji, Lydia Liu
TL;DR
This work tackles the challenge of evaluating prediction-based interventions by treating the human decision-maker as an active mediator rather than a passive receiver of predictions. It introduces a causal model with latent judge state variables $J_{i,k}$ and three bias channels (treatment exposure, capacity constraints, and low trust), and shows that standard analyses relying on the stable unit treatment value assumption (SUTVA) can fail under these dynamics. Through semi-synthetic experiments grounded in real-world data, the paper demonstrates how experimental design choices, such as treatment assignment and prediction thresholds, can significantly bias estimated average treatment effects, with pronounced effects for certain subgroups. The findings argue for multi-judge, two-level randomization designs and multi-factor experiments to improve the scientific validity and generalizability of evaluation schemes for automated decision systems (ADS). These insights have practical implications for designing credible ADS evaluations in criminal justice, healthcare, education, and beyond.
Abstract
Automated decision systems (ADS) are broadly deployed to inform and support human decision-making across a wide range of consequential settings. However, various context-specific details complicate the goal of establishing meaningful experimental evaluations for prediction-based interventions. Notably, current experiment designs rely on simplifying assumptions about human decision-making in order to derive causal estimates. In reality, specific experimental design decisions may induce cognitive biases in human decision-makers, which could then significantly alter the observed effect sizes of the prediction intervention. In this paper, we formalize and investigate various models of human decision-making when a predictive model is available as an aid. We show that each of these behavioural models produces dependencies across decision subjects and results in the violation of existing assumptions, with consequences for treatment effect estimation. This work aims to advance the scientific validity of intervention-based evaluation schemes for the assessment of ADS deployments.
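
To make the dependency across decision subjects concrete, the following is a minimal sketch of a single judge whose responsiveness to a shown prediction grows with the share of cases that receive one. It is a toy illustration, not the paper's model: the function name `simulate_ate`, the linear exposure rule, the Gaussian baseline outcomes, and all parameter values are assumptions made here for illustration only.

```python
import random


def simulate_ate(n_cases, treat_prob, true_effect=1.0, seed=0):
    """Difference-in-means estimate for one judge under a toy exposure model.

    Assumption (hypothetical, for illustration): the judge's response to a
    shown prediction scales linearly with the share of treated cases.
    """
    rng = random.Random(seed)
    # Bernoulli assignment: show the prediction (1) or withhold it (0).
    z = [1 if rng.random() < treat_prob else 0 for _ in range(n_cases)]
    # Share of this judge's cases in which a prediction was shown.
    exposure = sum(z) / n_cases
    # Each treated unit's outcome depends on the other units' assignments
    # through `exposure`, which violates the no-interference part of SUTVA.
    y = [rng.gauss(0.0, 1.0) + zi * true_effect * exposure for zi in z]
    treated = [yi for yi, zi in zip(y, z) if zi == 1]
    control = [yi for yi, zi in zip(y, z) if zi == 0]
    if not treated or not control:
        raise ValueError("both arms need at least one case")
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.8):
        print(p, round(simulate_ate(20000, p), 3))
```

Under this assumed rule the expected difference in means is `true_effect * exposure`, so the estimate should land near the treatment probability itself and changes with the assignment rate even though `true_effect` is held fixed. That is one simple way in which a design choice, rather than the intervention, can drive the estimated effect.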
