Achievable distributional robustness when the robust risk is only partially identified
Julia Kostin, Nicola Gnecco, Fanny Yang
TL;DR
This work addresses distributional robustness when the robust risk is not fully identifiable. It proposes the worst-case robust risk, together with a population minimax quantity, to characterize the best achievable robustness under partial identifiability. A concrete analysis in a linear structural causal model (SCM) shows how the identifiability of training-shift directions governs robustness and demonstrates that existing finite robustness methods can be suboptimal when unseen shifts are present. The authors validate the theory with synthetic and real-world gene-expression experiments, which show that accounting for partial identifiability yields better generalization under distribution shifts. The results suggest that robustness benchmarks should account for partial identifiability, and they motivate extensions to nonlinear settings and active data collection strategies.
Abstract
In safety-critical applications, machine learning models should generalize well under worst-case distribution shifts, that is, have a small robust risk. Invariance-based algorithms can provably take advantage of structural assumptions on the shifts when the training distributions are heterogeneous enough to identify the robust risk. In practice, however, such identifiability conditions are rarely satisfied, a scenario that has so far been underexplored in the theoretical literature. In this paper, we aim to fill this gap and propose to study the more general setting in which the robust risk is only partially identifiable. In particular, we introduce the worst-case robust risk as a new measure of robustness that is always well-defined, regardless of identifiability. Its minimum corresponds to an algorithm-independent (population) minimax quantity that measures the best achievable robustness under partial identifiability. While these concepts can be defined more broadly, in this paper we introduce and derive them explicitly for a linear model, for concreteness of presentation. First, we show that existing robustness methods are provably suboptimal in the partially identifiable case. We then evaluate these methods and the minimizer of the (empirical) worst-case robust risk on real-world gene expression data and find a similar trend: the test error of existing robustness methods grows increasingly suboptimal as the fraction of data from unseen environments increases, whereas accounting for partial identifiability allows for better generalization.
