Risk Measures and Upper Probabilities: Coherence and Stratification
Christian Fröhlich, Robert C. Williamson
TL;DR
The paper argues for replacing the standard expectation with coherent risk measures in order to capture risk aversion and ambiguity in ML, motivated by distributional shift and fairness concerns. By embedding coherent risk measures in rearrangement-invariant (ri) Banach spaces and leveraging Kusuoka representations, it characterizes spectral risk measures through their fundamental functions, linking CVaR, distortion risk measures, and Choquet integrals. It shows that the Lorentz and Marcinkiewicz norms bound all law-invariant coherent risk measures that share a given fundamental function, which yields a natural, tail-focused stratification of risk measures. The authors develop interpolation tools for constructing new risk measures from existing ones, analyze tail behavior via φ′(0), and demonstrate empirically that spectral risk measures improve robustness and reduce inequality in losses, at some cost to average performance. Overall, the work provides a principled, interpretable framework for quantifying and combining risk and uncertainty in ML using spectral risk measures and ri-space theory, with practical implications for robustness and fairness.
Abstract
Machine learning typically presupposes classical probability theory, which implies that aggregation is built upon expectation. There are now multiple reasons to consider richer alternatives to classical probability theory as a mathematical foundation for machine learning. We systematically examine a powerful and rich class of alternative aggregation functionals, known variously as spectral risk measures, Choquet integrals, or Lorentz norms. We present a range of characterization results and demonstrate what makes this spectral family so special. In doing so, we arrive at a natural stratification of all coherent risk measures in terms of the upper probabilities that they induce, by exploiting results from the theory of rearrangement-invariant Banach spaces. We empirically demonstrate how this new approach to uncertainty helps tackle practical machine learning problems.
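To make the idea of aggregating losses with a spectral risk measure instead of the expectation concrete, the following is a minimal sketch and not the authors' implementation. It assumes the convention that CVaR at level alpha averages the worst (1 - alpha) fraction of losses (the paper may parameterize the level differently), and it builds a spectral risk measure as a finite convex mixture of CVaRs. The function names `cvar` and `spectral_risk` are hypothetical and chosen for illustration only.

```python
import numpy as np


def cvar(losses, alpha):
    """Empirical CVaR: mean of the worst (1 - alpha) fraction of losses.

    Assumed convention: alpha in [0, 1), with alpha = 0 giving the plain mean.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    x = np.sort(np.asarray(losses, dtype=float))  # ascending order statistics
    n = x.size
    edges = np.arange(n + 1) / n
    # Each order statistic owns the quantile interval [i/n, (i+1)/n];
    # its weight is the length of that interval's overlap with [alpha, 1].
    overlap = np.clip(edges[1:], alpha, 1.0) - np.clip(edges[:-1], alpha, 1.0)
    weights = overlap / (1.0 - alpha)
    return float(weights @ x)


def spectral_risk(losses, levels, mix):
    """Finite convex mixture of CVaRs at the given levels (illustrative)."""
    mix = np.asarray(mix, dtype=float)
    if np.any(mix < 0) or not np.isclose(mix.sum(), 1.0):
        raise ValueError("mix must be nonnegative and sum to 1")
    return float(sum(m * cvar(losses, a) for a, m in zip(levels, mix)))


if __name__ == "__main__":
    losses = [1.0, 2.0, 3.0, 4.0]
    print(cvar(losses, 0.0))   # plain average of the losses
    print(cvar(losses, 0.5))   # average of the two largest losses
    print(spectral_risk(losses, levels=[0.0, 0.5], mix=[0.5, 0.5]))
```

Putting more mixture weight on higher levels emphasizes the tail of the loss distribution, which is the mechanism behind the robustness and fairness trade-off against average performance described in the summary.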
