Near-Optimal Algorithms for Group Distributionally Robust Optimization and Beyond

Tasuku Soma; Khashayar Gatmiry; Sharut Gupta; Stefanie Jegelka

TL;DR

This paper unifies a broad class of distributionally robust optimization problems under generalized group DRO and provides near-optimal stochastic algorithms. By formulating DRO as a two-player zero-sum game and applying online gradient descent for the model parameters and online mirror descent for the group weights, the authors derive GDRO-EXP3P and GDRO-TINF, which have substantially improved convergence rates over prior work, together with a matching information-theoretic lower bound for group DRO. They extend the framework to weighted ranking of group losses via permutahedra, achieving comparable rates and efficiency, and report empirical gains on both convex benchmarks and deep learning tasks. The results offer a principled, scalable approach to fairness and robustness across multiple subpopulation settings, with practical implications for robust ML deployment.
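The two-player view can be illustrated with a small sketch. This is not GDRO-EXP3P or GDRO-TINF: it is a deterministic, full-information simplification in which the model player runs projected online gradient descent and the group-weight player runs Hedge (online mirror descent with an entropy regularizer). The toy scalar losses, the function names, and the step-size choices are all assumptions made for illustration.

```python
import math


def group_losses(theta, centers):
    """Toy per-group losses L_i(theta) = 0.5 * (theta - c_i)^2 (assumed, not from the paper)."""
    return [0.5 * (theta - c) ** 2 for c in centers]


def solve_group_dro(centers, lo, hi, num_steps):
    """Approximate min over theta in [lo, hi] of max_i L_i(theta) by OGD-vs-Hedge play.

    Returns the averaged model iterate and the averaged group weights.
    """
    m = len(centers)
    # Bounds used only to set textbook fixed step sizes.
    grad_bound = max(max(abs(lo - c), abs(hi - c)) for c in centers)
    loss_bound = 0.5 * grad_bound ** 2
    eta_theta = (hi - lo) / (grad_bound * math.sqrt(num_steps))
    eta_q = math.sqrt(8.0 * math.log(m) / num_steps) / loss_bound

    theta = lo
    log_w = [0.0] * m  # Hedge weights kept in log space for stability
    theta_sum = 0.0
    q_sum = [0.0] * m

    for _ in range(num_steps):
        # Current group distribution q_t (softmax of the log-weights).
        top = max(log_w)
        w = [math.exp(v - top) for v in log_w]
        total = sum(w)
        q = [v / total for v in w]

        theta_sum += theta
        q_sum = [s + qi for s, qi in zip(q_sum, q)]

        losses = group_losses(theta, centers)
        # Gradient of sum_i q_i * L_i(theta) with respect to theta.
        grad = sum(qi * (theta - c) for qi, c in zip(q, centers))

        # Model player: projected gradient step on the q-weighted loss.
        next_theta = min(hi, max(lo, theta - eta_theta * grad))
        # Weight player: multiplicative update toward high-loss groups.
        log_w = [lw + eta_q * loss for lw, loss in zip(log_w, losses)]
        theta = next_theta

    return theta_sum / num_steps, [s / num_steps for s in q_sum]


if __name__ == "__main__":
    centers = [-1.0, 0.0, 3.0]
    theta_bar, q_bar = solve_group_dro(centers, lo=-1.0, hi=3.0, num_steps=5000)
    print("averaged theta:", theta_bar)
    print("averaged group weights:", q_bar)
    print("worst-group loss:", max(group_losses(theta_bar, centers)))
```

For this toy instance the worst-group loss is minimized at theta = 1, where it equals 2, so the averaged iterate should land close to that point. The paper's stochastic algorithms differ in that they work from sampled losses rather than the full loss vector used here.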

Abstract

Distributionally robust optimization (DRO) can improve the robustness and fairness of learning methods. In this paper, we devise stochastic algorithms for a class of DRO problems including group DRO, subpopulation fairness, and empirical conditional value at risk (CVaR) optimization. Our new algorithms achieve faster convergence rates than existing algorithms for multiple DRO settings. We also provide a new information-theoretic lower bound that implies our bounds are tight for group DRO. Empirically, too, our algorithms outperform known methods.
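For orientation, the problem class named in the abstract is commonly written as a min-max problem over group weights. The notation below ($\Theta$, $\Delta_m$, $P_i$, $\ell$, $L_i$, $Q$, $\alpha$) is an assumption, since this summary does not reproduce the paper's equations; it follows the standard formulations of group DRO and empirical CVaR.

```latex
% Group DRO: minimize the worst expected loss over m groups
\min_{\theta \in \Theta} \; \max_{q \in \Delta_m} \; \sum_{i=1}^{m} q_i \, L_i(\theta),
\qquad L_i(\theta) = \mathbb{E}_{z \sim P_i}\bigl[\ell(\theta; z)\bigr]

% Generalized form (assumed): restrict q to a convex subset Q of the simplex
\min_{\theta \in \Theta} \; \max_{q \in Q} \; \sum_{i=1}^{m} q_i \, L_i(\theta),
\qquad Q \subseteq \Delta_m

% Empirical CVaR at level \alpha over n samples, as one choice of Q
Q = \Bigl\{ q \in \Delta_n : q_i \le \tfrac{1}{\alpha n} \ \text{for all } i \Bigr\}
```

Taking $Q = \Delta_m$ recovers plain group DRO, while capping each weight at $1/(\alpha n)$ forces the adversary to spread its mass over at least an $\alpha$ fraction of the samples.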

Paper Structure (60 sections, 12 theorems, 58 equations, 2 figures, 4 tables, 5 algorithms)

Introduction
Contributions.
Our techniques
Related work
Notations.
Examples contained in generalized group DRO
Group DRO.
Empirical CVaR, Subpopulation fairness, Average top-$k$ worst group loss.
Weighted ranking of group losses.
Algorithms
Algorithm for the general case
Regularizer.
Step sizes.
Projection step.
Algorithms for Group DRO
...and 45 more sections

Key Result

Theorem 1

If $\eta_{\theta, t}$ is nonincreasing, the algorithm for the general case (label alg:general) achieves the expected convergence rate for any fixed saddle point $(\theta^*, q^*)$.

Figures (2)

Figure 1: Results on Adult dataset for convex losses. Both axes are log-scale.
Figure 2: Results on the synthetic dataset for the convex regime. Both axes are log-scale.

Theorems & Definitions (13)

Theorem 1
Theorem 2
Theorem 3
Theorem 4
Theorem 5: Lower bound for group DRO
Lemma 1: Regret Bound of OMD; see, e.g., Orabona2019book
Lemma 2: Regret Bound of OGD
Lemma 3: Regret Bound of Hedge
Lemma 4: Regret Bound of Tsallis-INF
Theorem 6: see, e.g., Bubeck2012
...and 3 more
