Learning With Multi-Group Guarantees For Clusterable Subpopulations

Jessica Dai, Nika Haghtalab, Eric Zhao

TL;DR

This work addresses the challenge of providing per-subpopulation guarantees when subgroups are endogenously defined by distributional structure rather than predefined features. It introduces a multi-objective online calibration framework that leverages multicalibration to produce guarantees across a covering of plausible clusterings, achieving an $O(T^{1/2})$ rate without requiring cluster separability. It also analyzes a cluster-then-predict baseline, showing that it attains an $O(T^{2/3})$ rate and depends on cluster separation, and demonstrates that per-subgroup guarantees can be easier to obtain than exact subgroup learning, especially for discriminant calibration. Theoretical results cover exponential-family and Gaussian-mixture subgroups, with implications for fairness and auditing in real-world prediction tasks, and point to broad generalizations beyond calibration to other Blackwell-approachability settings.

Abstract

A canonical desideratum for prediction problems is that performance guarantees should hold not just on average over the population, but also for meaningful subpopulations within the overall population. But what constitutes a meaningful subpopulation? In this work, we take the perspective that relevant subpopulations should be defined with respect to the clusters that naturally emerge from the distribution of individuals for which predictions are being made. In this view, a population refers to a mixture model whose components constitute the relevant subpopulations. We suggest two formalisms for capturing per-subgroup guarantees: first, by attributing each individual to the component from which they were most likely drawn, given their features; and second, by attributing each individual to all components in proportion to their relative likelihood of having been drawn from each component. Using online calibration as a case study, we study a multi-objective algorithm that provides guarantees for each of these formalisms by handling all plausible underlying subpopulation structures simultaneously, and achieve an $O(T^{1/2})$ rate even when the subpopulations are not well-separated. In comparison, the more natural cluster-then-predict approach that first recovers the structure of the subpopulations and then makes predictions suffers from an $O(T^{2/3})$ rate and requires the subpopulations to be separable. Along the way, we prove that providing per-subgroup calibration guarantees for underlying clusters can be easier than learning the clusters: separation between median subgroup features is required for the latter but not the former.
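
The two attribution formalisms in the abstract can be illustrated with a minimal sketch. It assumes a mixture of isotropic Gaussians with known means and mixing weights, which is a simplification: in the paper the mixture is unknown. The function names (`responsibilities`, `hard_assignment`, `group_calibration_error`) and the particular bucketed $\ell_1$ form of the per-group error are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def responsibilities(x, means, weights, sigma=1.0):
    """Posterior probability of each component given features.

    Assumes isotropic Gaussian components with shared scale `sigma`.
    x: (n, d), means: (k, d), weights: (k,) -> returns (n, k).
    """
    sq_dist = ((x[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    log_post = np.log(weights)[None, :] - sq_dist / (2.0 * sigma**2)
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def hard_assignment(resp):
    """First formalism: attribute each point to its most likely component."""
    out = np.zeros_like(resp)
    out[np.arange(len(resp)), resp.argmax(axis=1)] = 1.0
    return out

def group_calibration_error(preds, outcomes, group_weights):
    """Per-group calibration error, summed over distinct prediction values.

    For each group, sums |sum_t w_g(x_t) * (y_t - p_t)| over prediction
    buckets. With one-hot weights this follows the first formalism; with
    responsibilities as weights it follows the second.
    """
    errs = np.zeros(group_weights.shape[1])
    for v in np.unique(preds):
        mask = preds == v
        resid = (outcomes[mask] - v)[:, None] * group_weights[mask]
        errs += np.abs(resid.sum(axis=0))
    return errs

# Hypothetical toy data: two components, three individuals.
means = np.array([[0.0, 0.0], [4.0, 0.0]])
weights = np.array([0.5, 0.5])
x = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 0.0]])
preds = np.array([0.5, 0.5, 0.5])
outcomes = np.array([1.0, 0.0, 1.0])

soft = responsibilities(x, means, weights)
hard = hard_assignment(soft)
print(group_calibration_error(preds, outcomes, hard))
print(group_calibration_error(preds, outcomes, soft))
```

The only difference between the two calls is the weight matrix: one-hot rows attribute each individual to a single component, while responsibility rows split each individual across components in proportion to relative likelihood.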

Paper Structure

This paper contains 30 sections, 15 theorems, 56 equations, 1 figure, 4 algorithms.

Key Result

Proposition 3.1

Let $f$ be an unknown endogenous subgroups model whose Gaussian components are isotropic with $\|\mu_1-\mu_2\| \geq \gamma$. Then, with probability $1-\delta$, the Cluster-Then-Predict algorithm attains discriminant calibration error of $O(d^{1/3} T^{2/3} \gamma^{-4/3} + \sqrt{T \log(1/\delta)}),$ w

Figures (1)

  • Figure 1: Illustration of the non-linear boundaries that arise in the cluster assignment functions of non-isotropic Gaussian mixtures for $k = 2$. Means of each component are marked with red stars.

Theorems & Definitions (35)

  • Definition 1: Discriminant Calibration Error
  • Definition 2: Likelihood Calibration Error
  • Definition 3: bartlett99, Chapter 11
  • Definition 4
  • Proposition 3.1
  • Proof sketch
  • Remark 3.2
  • Remark 3.3
  • Proposition 3.4
  • Theorem 4.1
  • ...and 25 more