Robustness Beyond Known Groups with Low-rank Adaptation

Abinitha Gourabathina, Hyewon Jeong, Teya Bergamaschi, Marzyeh Ghassemi, Collin Stultz

TL;DR

This work addresses model fairness and robustness when sensitive subgroups are unknown or unlabeled. It introduces LEIA, a two-stage method that first identifies a low-rank, error-informed subspace of the representation space via an error-weighted covariance, then applies a constrained logit-space adjustment to correct latent failure modes without modifying the backbone or requiring group labels. Evaluated on five real-world datasets under no, partial, and full knowledge of subgroup relevance, LEIA consistently improves worst-group performance while remaining fast and parameter-efficient. The approach highlights the importance of evaluating robustness in realistic settings with unknown groups and demonstrates practical gains when subgroup annotations are unavailable or incomplete.

Abstract

Deep learning models trained to optimize average accuracy often exhibit systematic failures on particular subpopulations. In real-world settings, the subpopulations most affected by such disparities are frequently unlabeled or unknown, which motivates methods that perform well on sensitive subgroups without requiring those subgroups to be pre-specified. However, existing group-robust methods typically assume prior knowledge of the relevant subgroups, using group annotations for training or model selection. We propose Low-rank Error Informed Adaptation (LEIA), a simple two-stage method that improves group robustness by identifying a low-dimensional subspace in the representation space where model errors concentrate. LEIA restricts adaptation to this error-informed subspace via a low-rank adjustment to the classifier logits, directly targeting latent failure modes without modifying the backbone or requiring group labels. Using five real-world datasets, we analyze group robustness under three settings: (1) no knowledge of subgroup relevance, (2) partial knowledge of subgroup relevance, and (3) full knowledge of subgroup relevance. Across all settings, LEIA consistently improves worst-group performance while remaining fast, parameter-efficient, and robust to hyperparameter choice.
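
The two stages described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-sample error weights `err`, the centering around the weighted mean, the function names, and the unconstrained adapter matrix `A` are all hypothetical choices made here, and the summary does not specify how the adjustment is constrained or trained.

```python
import numpy as np

def error_weighted_covariance(Z, err):
    """Covariance of embeddings Z (n, d), weighting sample i by err[i] >= 0."""
    Z = np.asarray(Z, dtype=float)
    w = np.asarray(err, dtype=float)
    w = w / w.sum()                  # normalize weights to sum to 1
    mu = w @ Z                       # error-weighted mean embedding, shape (d,)
    Zc = Z - mu                      # center around the weighted mean (assumption)
    return (Zc * w[:, None]).T @ Zc  # (d, d), symmetric positive semidefinite

def top_k_directions(C, k):
    """Top-k eigenvectors (as columns) and eigenvalues of a symmetric matrix C."""
    vals, vecs = np.linalg.eigh(C)   # eigh returns eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]
    return vecs[:, idx], vals[idx]

def adjust_logits(logits, Z, U, A):
    """Add a correction of rank <= k: project Z onto U (d, k), map to classes with A (k, c)."""
    return logits + (Z @ U) @ A

# Toy demo on synthetic data: errors are made to concentrate along embedding axis 3.
rng = np.random.default_rng(0)
Z = rng.normal(size=(2000, 8))
err = Z[:, 3] ** 2                   # hypothetical per-sample error weights
C = error_weighted_covariance(Z, err)
U, lam = top_k_directions(C, k=2)
A = np.zeros((2, 2))                 # untrained adapter; zeros leave the logits unchanged
logits = rng.normal(size=(2000, 2))
adjusted = adjust_logits(logits, Z, U, A)
print("alignment of top direction with axis 3:", abs(U[3, 0]))
```

Only `A` (k × number of classes) would be learned in this sketch, which is why such an adjustment stays parameter-efficient and leaves the backbone untouched.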

Paper Structure (51 sections, 3 theorems, 21 equations, 8 figures, 13 tables, 1 algorithm)

Key Result

Proposition 3.3

Let $\mathcal{H}_\mathrm{DRO}(\mathcal{G}') \subseteq \mathcal{H}$ (where $\mathcal{G}' \subseteq \mathcal{G}$) denote the set of hypotheses that minimize the DRO objective over the known groups $G \in \mathcal{G}'$. Suppose we are given a latent partition $\mathcal{G} = \{G_1, G_2, \dots G_K\}$. Th

Figures (8)

  • Figure 1: Low-rank Error Informed Adaptation in action. After training with ERM, LEIA adjusts ERM's decision boundary along a low-rank error subspace. A) High-dimensional embedding space projected onto three dimensions via PCA, showing an entangled cluster for a binary classification problem with two subgroups ($\circ$ and ☆). B) and C) ERM's boundary in the high-variance directions compared to LEIA's adjusted boundary in the identified error directions (eigenvectors). Yellow stars indicate misclassified points of the worst-performing subgroup. D) Probability adjustments along the identified principal eigenvector. Samples from the worst group (☆) that ERM initially misclassifies are corrected by LEIA directly in logit space. The panels show the improved worst-group accuracy in this synthetic data setup.
  • Figure 2: Low-rank Error Structure across datasets. The graphs show the cumulative explained variance (CEV) of the top-$k$ eigenvectors of the error-covariance matrix for (a) CelebA; (b) CivilComments; (c) MultiNLI; and (d) Waterbirds. CEV is defined as $\sum_{i=1}^{k} \lambda_i / \sum_{i=1}^{d} \lambda_i$, where $\lambda_1 \ge \dots \ge \lambda_d$ are the eigenvalues and $d$ is the embedding dimension. The range of $k$ for which the CEV is between 50% and 90% is shaded. The visualization demonstrates that (i) LEIA meaningfully learns an error-informed subspace; and (ii) this subspace is low-rank with respect to the embedding dimension (2048 for image datasets and 728 for text datasets).
  • Figure 3: Training time (in minutes) across different datasets. LEIA only incurs a small overhead compared to standard ERM and is substantially less expensive than recent methods like JTT, DPE, and GIC.
  • Figure 4: Robustness of LEIA to the rank parameter $k$. LEIA maintains strong test worst-group accuracy (WGA) throughout the tuning region for $k$ identified in Figure 2. Performance is averaged across 3 seeds.
  • Figure 5: Performance trends as a function of unknown group size ratio. Left: Unknown group accuracy for ERM and Group DRO averaged across all numbers of known groups. ERM accuracy improves with larger unknown group size (more training data), while Group DRO accuracy remains relatively constant or decreases. Right: Harm (ERM - Group DRO) increases monotonically with unknown group size ratio, demonstrating that Group DRO's strategy of ignoring unknown groups becomes increasingly problematic as unknown groups become more prevalent in the data.
  • ...and 3 more figures
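
The CEV quantity from the Figure 2 caption, and a candidate range for $k$ between the 50% and 90% levels, can be computed as in the sketch below. The eigenvalues are made-up numbers for illustration, the function names are hypothetical, and reading the shaded region as "smallest $k$ reaching 50% up to smallest $k$ reaching 90%" is an assumption about how the region is delimited.

```python
import numpy as np

def cumulative_explained_variance(eigvals):
    """CEV(k) = sum of the k largest eigenvalues / sum of all eigenvalues, for k = 1..d."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]  # descending order
    return np.cumsum(lam) / lam.sum()

def smallest_rank_reaching(cev, threshold):
    """Smallest k (1-indexed) with CEV(k) >= threshold; assumes threshold <= 1."""
    # cev is non-decreasing for non-negative eigenvalues, so binary search applies.
    return int(np.searchsorted(cev, threshold)) + 1

eigvals = [6.0, 2.5, 1.0, 0.5]          # illustrative values, not from the paper
cev = cumulative_explained_variance(eigvals)
k_lo = smallest_rank_reaching(cev, 0.5)  # lower end of the candidate range for k
k_hi = smallest_rank_reaching(cev, 0.9)  # upper end of the candidate range for k
print(cev, k_lo, k_hi)
```

A steep CEV curve, where `k_hi` is small relative to the embedding dimension, is what the caption refers to as low-rank error structure.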

Theorems & Definitions (9)

  • Definition 3.1: Empirical Risk Minimization (ERM)
  • Definition 3.2: Group Distributionally Robust Optimization (Group DRO) (Sagawa et al., 2020)
  • Proposition 3.3
  • Proof sketch
  • Lemma 2.1
  • proof
  • proof
  • Proposition 5.1: Spectral Optimality
  • proof
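
For orientation, the two objectives named in Definitions 3.1 and 3.2 are commonly written as below. This is the standard form of ERM and Group DRO, not a transcription of the paper's own statements: the loss $\ell$, the sample size $n$, and the per-group distribution $P_G$ are notation assumed here, while $\mathcal{H}$, $\mathcal{G}'$, and $\mathcal{H}_\mathrm{DRO}(\mathcal{G}')$ follow the Key Result above.

```latex
\begin{align*}
\hat{h}_{\mathrm{ERM}} &\in \arg\min_{h \in \mathcal{H}} \frac{1}{n} \sum_{i=1}^{n} \ell\big(h(x_i), y_i\big), \\
\mathcal{H}_{\mathrm{DRO}}(\mathcal{G}') &= \arg\min_{h \in \mathcal{H}} \max_{G \in \mathcal{G}'} \mathbb{E}_{(x, y) \sim P_G}\big[\ell\big(h(x), y\big)\big].
\end{align*}
```

Because the inner maximum ranges only over the known groups, any group outside $\mathcal{G}'$ does not enter the Group DRO objective in this form.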