Distribution-Free Statistical Dispersion Control for Societal Applications
Zhun Deng, Thomas P. Zollo, Jake C. Snell, Toniann Pitassi, Richard Zemel
TL;DR
This work provides distribution-free guarantees on the dispersion of predictive losses across a population, rather than on expected loss alone. It develops a two-step framework: first obtain two-sided confidence bounds on the loss CDF $F$ from validation data, then propagate these bounds to nonlinear and group-based dispersion functionals such as the Gini coefficient, the Atkinson index, and CVaR-based fairness metrics. To tighten the bounds when data are scarce, the authors introduce an optimization procedure that selects bound thresholds with a neural parameterization, followed by a post-processing step that ensures the resulting bounds still satisfy the distribution-free guarantee. Experiments on toxic comment detection, medical imaging, and film recommendation show improved fairness-oriented model selection and tighter, more reliable bounds on societal dispersion measures. The result gives practitioners finite-sample guarantees on distribution-wide equity metrics in high-stakes applications.
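The two-step recipe can be illustrated with a minimal sketch. It is not the authors' procedure, and it rests on several assumptions that do not come from the paper: losses are bounded in $[0, 1]$, the CDF band comes from the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality, and the Gini coefficient is bounded loosely by treating its numerator $\int F(1-F)\,dx$ and denominator $\int (1-F)\,dx$ separately. The function names `dkw_band` and `gini_upper_bound` are hypothetical.

```python
import numpy as np


def dkw_band(losses, delta):
    """Step 1: two-sided DKW confidence band on the loss CDF.

    Assumes losses lie in [0, 1]. Returns the widths of the intervals
    between consecutive sorted losses (padded with 0 and 1) and the
    lower/upper CDF bounds, which are constant on each interval.
    """
    x = np.sort(np.asarray(losses, dtype=float))
    n = x.size
    # DKW: sup_x |F_n(x) - F(x)| <= eps with probability >= 1 - delta.
    eps = np.sqrt(np.log(2.0 / delta) / (2.0 * n))
    knots = np.concatenate(([0.0], x, [1.0]))
    widths = np.diff(knots)            # n + 1 intervals
    f_emp = np.arange(n + 1) / n       # empirical CDF on interval i is i / n
    lower = np.clip(f_emp - eps, 0.0, 1.0)
    upper = np.clip(f_emp + eps, 0.0, 1.0)
    return widths, lower, upper


def gini_upper_bound(losses, delta=0.05):
    """Step 2: propagate the CDF band to an upper bound on the Gini coefficient.

    Uses Gini = int F(1 - F) dx / int (1 - F) dx for nonnegative losses.
    The numerator is maximized pointwise over the band and the denominator
    (the mean) is minimized over the band, so the ratio is a valid but
    possibly loose upper bound whenever the band contains the true CDF.
    """
    widths, lower, upper = dkw_band(losses, delta)
    # t * (1 - t) on [lower, upper] is maximized at the point closest to 1/2.
    t = np.clip(0.5, lower, upper)
    numerator = np.sum(widths * t * (1.0 - t))
    mean_lower = np.sum(widths * (1.0 - upper))
    if mean_lower <= 0.0:
        return 1.0                     # band too wide to say anything useful
    return float(min(1.0, numerator / mean_lower))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    val_losses = rng.uniform(0.0, 1.0, size=2000)  # synthetic validation losses
    print(gini_upper_bound(val_losses, delta=0.05))
```

Bounding the numerator and denominator separately keeps the sketch short but leaves slack. It is meant only to show how a single band on $F$ yields a guarantee for a nonlinear functional.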
Abstract
Explicit finite-sample statistical guarantees on model performance are an important ingredient in responsible machine learning. Previous work has focused mainly on bounding either the expected loss of a predictor or the probability that an individual prediction will incur a loss value in a specified range. However, for many high-stakes applications, it is crucial to understand and control the dispersion of a loss distribution, or the extent to which different members of a population experience unequal effects of algorithmic decisions. We initiate the study of distribution-free control of statistical dispersion measures with societal implications and propose a simple yet flexible framework that allows us to handle a much richer class of statistical functionals beyond previous work. Our methods are verified through experiments in toxic comment detection, medical imaging, and film recommendation.
