Near-Optimal Algorithms for Group Distributionally Robust Optimization and Beyond

Tasuku Soma, Khashayar Gatmiry, Sharut Gupta, Stefanie Jegelka

TL;DR

This paper unifies a broad class of distributionally robust optimization (DRO) problems under generalized group DRO and provides near-optimal stochastic algorithms. The authors formulate DRO as a two-player zero-sum game, applying online gradient methods to the model parameters and online mirror descent to the group weights. From this they derive two algorithms, GDRO-EXP3P and GDRO-TINF, with substantially improved convergence rates over prior work, plus a matching information-theoretic lower bound for group DRO. They extend the framework to weighted ranking via permutahedra with comparable rates and efficiency, and report strong empirical gains on both convex benchmarks and deep learning tasks. The results offer a principled, scalable approach to fairness and robustness across multiple subpopulation settings, with practical implications for robust ML deployment.
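The two-player view can be made concrete with a small sketch. The code below is an illustrative toy, not the paper's GDRO-EXP3P or GDRO-TINF: it assumes a one-dimensional model `theta` constrained to [0, 1], a squared loss, noise-free group targets, a projected gradient step for the model player, and an EXP3-style importance-weighted exponential-weights step for the group-weight player. The function name, step sizes, and data are all hypothetical choices made for illustration.

```python
import math
import random


def group_dro_sketch(centers, steps=5000, eta_theta=0.05, eta_q=0.05, seed=0):
    """Toy stochastic group DRO on [0, 1] with loss (theta - z)^2.

    Assumption: group i always emits the target centers[i] (no noise).
    Returns the averaged iterate and the final group weights q.
    """
    rng = random.Random(seed)
    m = len(centers)
    theta = 0.0
    log_w = [0.0] * m          # log-weights of the group player
    theta_sum = 0.0
    q = [1.0 / m] * m

    for _ in range(steps):
        # Group player: q is the softmax of the log-weights (stabilized).
        top = max(log_w)
        w = [math.exp(v - top) for v in log_w]
        total = sum(w)
        q = [v / total for v in w]

        # Sample one group from q, then one data point from that group.
        i = rng.choices(range(m), weights=q)[0]
        z = centers[i]
        loss = (theta - z) ** 2
        grad = 2.0 * (theta - z)

        theta_sum += theta

        # Model player: projected online gradient descent step.
        theta = min(1.0, max(0.0, theta - eta_theta * grad))

        # Group player: importance-weighted loss estimate (unbiased for the
        # per-group loss vector), used to upweight poorly served groups.
        log_w[i] += eta_q * loss / q[i]

    return theta_sum / steps, q


def worst_group_loss(theta, centers):
    return max((theta - c) ** 2 for c in centers)
```

A call such as `group_dro_sketch([0.0, 1.0])` returns an averaged `theta` and the last weight vector `q`. Sampling a single group per round is what makes the weight player's feedback bandit-like, which is the setting where EXP3-type and Tsallis-INF-type updates are normally used.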

Abstract

Distributionally robust optimization (DRO) can improve the robustness and fairness of learning methods. In this paper, we devise stochastic algorithms for a class of DRO problems including group DRO, subpopulation fairness, and empirical conditional value at risk (CVaR) optimization. Our new algorithms achieve faster convergence rates than existing algorithms for multiple DRO settings. We also provide a new information-theoretic lower bound that implies our bounds are tight for group DRO. Empirically, too, our algorithms outperform known methods.
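To see why empirical CVaR fits a worst-case-reweighting framework, recall the standard dual form: the empirical CVaR at level α of n losses is the largest weighted average of those losses over weight vectors that sum to one and place at most 1/(αn) on any single sample. The sketch below computes that value greedily. The function name and interface are hypothetical, and the code evaluates the objective only; it is not one of the paper's optimization algorithms.

```python
def empirical_cvar(losses, alpha):
    """Empirical CVaR at level alpha: the mean of the worst alpha-fraction.

    Computed as max over weights q with sum(q) == 1 and
    0 <= q_i <= 1 / (alpha * n) of sum(q_i * loss_i); the maximizer
    greedily puts the capped weight on the largest losses.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")

    n = len(losses)
    cap = 1.0 / (alpha * n)    # maximum weight allowed on one sample
    remaining = 1.0            # probability mass still to be assigned
    value = 0.0
    for loss in sorted(losses, reverse=True):
        weight = min(cap, remaining)
        value += weight * loss
        remaining -= weight
        if remaining <= 0.0:
            break
    return value
```

With `alpha=1.0` the cap is 1/n and the value reduces to the plain average; as `alpha` shrinks toward 1/n the value approaches the maximum loss.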

Paper Structure

This paper contains 60 sections, 12 theorems, 58 equations, 2 figures, 4 tables, 5 algorithms.

Key Result

Theorem 1

If $\eta_{\theta, t}$ is nonincreasing, Algorithm alg:general achieves the expected convergence rate for any fixed saddle point $(\theta^*, q^*)$.

Figures (2)

  • Figure 1: Results on Adult dataset for convex losses. Both axes are log-scale.
  • Figure 2: Results on the synthetic dataset for the convex regime. Both axes are log-scale.

Theorems & Definitions (13)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5: Lower bound for group DRO
  • Lemma 1: Regret Bound of OMD; see, e.g., Orabona2019book
  • Lemma 2: Regret Bound of OGD
  • Lemma 3: Regret Bound of Hedge
  • Lemma 4: Regret Bound of Tsallis-INF
  • Theorem 6: see, e.g., Bubeck2012
  • ...and 3 more