Learning Models with Uniform Performance via Distributionally Robust Optimization
John Duchi, Hongseok Namkoong
TL;DR
The paper develops a convex distributionally robust optimization (DRO) framework for achieving uniformly good performance under distributional shift and across latent subpopulations. By formulating robustness via f-divergence balls around the nominal distribution, it links worst-case risk to tail performance and derives a tractable plug-in estimator with a dual formulation. The authors provide finite-sample convergence guarantees, minimax lower bounds, and asymptotic normality results, clarifying the statistical costs and tradeoffs of robustness. Empirical studies across domain adaptation, tail performance, and fine-grained subpopulations demonstrate improved tail and subpopulation performance at a controlled cost to average performance, with practical heuristics for choosing the robustness parameters. The work offers a principled, scalable approach to robust learning with theoretical guarantees and broad applicability to safety- and fairness-critical tasks.
Abstract
A common goal in statistics and machine learning is to learn models that can perform well against distributional shifts, such as latent heterogeneous subpopulations, unknown covariate shifts, or unmodeled temporal effects. We develop and analyze a distributionally robust stochastic optimization (DRO) framework that learns a model providing good performance against perturbations to the data-generating distribution. We give a convex formulation for the problem, providing several convergence guarantees. We prove finite-sample minimax upper and lower bounds, showing that distributional robustness sometimes comes at a cost in convergence rates. We give limit theorems for the learned parameters, where we fully specify the limiting distribution so that confidence intervals can be computed. On real tasks including generalizing to unknown subpopulations, fine-grained recognition, and providing good tail performance, the distributionally robust approach often exhibits improved performance.
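To make the convex formulation concrete, the following is a minimal sketch of a plug-in worst-case risk computed from a sample of losses. It rests on assumptions this page does not spell out: the divergence is taken from the Cressie-Read family f_k(t) = (t^k - kt + k - 1) / (k(k - 1)) with k > 1, and the worst-case risk over the ball {Q : D_f(Q || P) <= rho} is evaluated through the one-dimensional dual inf_eta { c_k(rho) * (E[(loss - eta)_+^{k*}])^{1/k*} + eta }, with k* = k / (k - 1) and c_k(rho) = (1 + k(k - 1) rho)^{1/k}. The function name, its parameters, and the ternary-search solver are illustrative choices, not the authors' implementation.

```python
import math


def cressie_read_worst_case_risk(losses, rho, k=2.0, iters=200):
    """Plug-in worst-case risk over a Cressie-Read divergence ball (sketch).

    losses: per-example losses at a fixed model parameter (hypothetical input)
    rho:    radius of the divergence ball, must be > 0
    k:      Cressie-Read exponent, must be > 1 (k = 2 is the chi-square case)
    """
    if rho <= 0 or k <= 1:
        raise ValueError("this sketch assumes rho > 0 and k > 1")
    n = len(losses)
    k_star = k / (k - 1.0)                           # conjugate exponent
    c = (1.0 + k * (k - 1.0) * rho) ** (1.0 / k)     # c_k(rho) > 1

    def dual(eta):
        # c_k(rho) * (mean of (loss - eta)_+^{k*})^{1/k*} + eta
        moment = sum(max(l - eta, 0.0) ** k_star for l in losses) / n
        return c * moment ** (1.0 / k_star) + eta

    # The dual objective is convex in eta. At eta = max(losses) it equals
    # max(losses), and it is bounded below by c * mean - (c - 1) * eta, so a
    # minimizer lies in the bracket [lo, hi] chosen here.
    mean = sum(losses) / n
    hi = max(losses)
    lo = (c * mean - hi) / (c - 1.0)

    # Ternary search, valid because the objective is convex on [lo, hi].
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dual(m1) <= dual(m2):
            hi = m2
        else:
            lo = m1
    return dual(0.5 * (lo + hi))


if __name__ == "__main__":
    sample_losses = [0.0, 1.0]
    print(cressie_read_worst_case_risk(sample_losses, rho=0.02, k=2.0))
```

For k = 2 and a small radius, the result can be checked against the closed form mean + sqrt(2 rho) * std: with losses [0, 1] and rho = 0.02 both give 0.6. Larger rho pushes the value toward the maximum loss, which is the sense in which the worst-case risk tracks tail performance. In a training loop the same dual would be minimized jointly over eta and the model parameters, but that step is outside this sketch.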
