Statistical Analysis of Conditional Group Distributionally Robust Optimization with Cross-Entropy Loss
Zijian Guo, Zhenyu Wang, Yifan Hu, Francis Bach
TL;DR
This work develops Conditional Group Distributionally Robust Optimization (CG-DRO) for multi-source unsupervised domain adaptation with cross-entropy loss, addressing distributional shifts across domains by guarding against the worst-case mixture of source conditional distributions. It introduces a Mirror Prox algorithm combined with Double Machine Learning to estimate the risk, so that nuisance-estimation errors enter only at higher-order rates, and proves fast convergence rates through two surrogate minimax problems. Because the empirical CG-DRO estimator can have a nonstandard limiting distribution, the authors formulate a perturbation-based inference framework that yields uniformly valid confidence intervals and tests even when the estimator is not asymptotically normal. Simulations complement the theory, demonstrating estimation accuracy, nonregular and unstable behavior, and valid uncertainty quantification, with practical implications for robust transfer learning under domain shifts.
Abstract
In multi-source learning with discrete labels, distributional heterogeneity across domains poses a central challenge to developing predictive models that transfer reliably to unseen domains. We study multi-source unsupervised domain adaptation, where labeled data are available from multiple source domains and only unlabeled data are observed from the target domain. To address potential distribution shifts, we propose a novel Conditional Group Distributionally Robust Optimization (CG-DRO) framework that learns a classifier by minimizing the worst-case cross-entropy loss over convex combinations of the conditional outcome distributions from the source domains. We develop an efficient Mirror Prox algorithm for solving the minimax problem and employ a double machine learning procedure to estimate the risk function, ensuring that errors in nuisance estimation contribute only at higher-order rates. We establish fast statistical convergence rates for the empirical CG-DRO estimator by constructing two surrogate minimax optimization problems that serve as theoretical bridges. A distinguishing challenge for CG-DRO is the emergence of nonstandard asymptotics: the empirical CG-DRO estimator may fail to converge to a standard limiting distribution due to boundary effects and system instability. To address this, we introduce a perturbation-based inference procedure that enables uniformly valid inference, including confidence interval construction and hypothesis testing.
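To make the minimax structure concrete, the following is a minimal illustrative sketch, not the authors' implementation. Because cross-entropy is linear in the label distribution, the loss under a mixture of source conditionals equals the weighted sum of per-source losses, so the inner maximization runs over mixture weights on the simplex. The sketch assumes a binary logistic model, a fixed grid of covariates standing in for target samples, and known source conditionals P(Y=1 | x); in practice these would be estimated, and the double machine learning correction is omitted. The function names, step size, and toy data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def group_losses_and_grads(theta, X, P):
    """Per-source cross-entropy and its gradient for a logistic model.

    X: (n, d) covariates; P: (L, n) with P[l, i] = P(Y=1 | x_i) in source l.
    Returns losses of shape (L,) and gradients of shape (L, d).
    """
    z = X @ theta
    # Stable form of -p*log(sigmoid(z)) - (1-p)*log(1 - sigmoid(z)).
    losses = (P * np.logaddexp(0.0, -z) + (1.0 - P) * np.logaddexp(0.0, z)).mean(axis=1)
    grads = (sigmoid(z) - P) @ X / X.shape[0]
    return losses, grads

def mirror_prox(X, P, steps=3000, eta=0.1):
    """Mirror Prox for min over theta, max over simplex weights gamma, of
    sum_l gamma[l] * loss_l(theta). Euclidean steps are used for theta and
    entropic (multiplicative) steps for gamma. Returns averaged iterates."""
    L, d = P.shape[0], X.shape[1]
    theta, gamma = np.zeros(d), np.full(L, 1.0 / L)
    theta_avg, gamma_avg = np.zeros(d), np.zeros(L)
    for _ in range(steps):
        # Extrapolation step: gradients at the current point.
        losses, grads = group_losses_and_grads(theta, X, P)
        theta_half = theta - eta * gamma @ grads
        gamma_half = gamma * np.exp(eta * losses)
        gamma_half /= gamma_half.sum()
        # Update step: start from the current point, use gradients at the half point.
        losses_h, grads_h = group_losses_and_grads(theta_half, X, P)
        theta = theta - eta * gamma_half @ grads_h
        gamma = gamma * np.exp(eta * losses_h)
        gamma /= gamma.sum()
        theta_avg += theta_half / steps
        gamma_avg += gamma_half / steps
    return theta_avg, gamma_avg

# Toy example (hypothetical): two sources whose conditionals depend on
# different directions of a 3x3 covariate grid.
grid = np.array([-1.0, 0.0, 1.0])
X = np.array([[a, b] for a in grid for b in grid])
P = np.vstack([
    sigmoid(X @ np.array([2.0, 0.0])),
    sigmoid(X @ np.array([0.5, 1.5])),
])
theta_hat, gamma_hat = mirror_prox(X, P)
worst_case_loss = group_losses_and_grads(theta_hat, X, P)[0].max()
```

Here `gamma_hat` approximates the adversarial mixture weights and `worst_case_loss` is the largest per-source cross-entropy at the averaged classifier, which is the quantity the outer minimization controls. The perturbation-based inference step described in the abstract is not illustrated here.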
