Multi-group Learning for Hierarchical Groups

Samuel Deng, Daniel Hsu

TL;DR

This work studies multi-group learning in the natural case where the groups are hierarchically structured. It designs an algorithm that outputs an interpretable, deterministic decision tree predictor with near-optimal sample complexity, and that achieves attractive generalization properties on real datasets with hierarchical group structure.

Abstract

The multi-group learning model formalizes the learning scenario in which a single predictor must generalize well on multiple, possibly overlapping subgroups of interest. We extend the study of multi-group learning to the natural case where the groups are hierarchically structured. We design an algorithm for this setting that outputs an interpretable and deterministic decision tree predictor with near-optimal sample complexity. We then conduct an empirical evaluation of our algorithm and find that it achieves attractive generalization properties on real datasets with hierarchical group structure.
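
To make the multi-group objective concrete, the sketch below measures the error of a single predictor separately on several overlapping groups. It is only an illustration: the function name `group_errors`, the toy labels, and the group names are assumptions and do not come from the paper.

```python
def group_errors(y_true, y_pred, groups):
    """Return the 0-1 error of one predictor restricted to each group.

    `groups` maps a (hypothetical) group name to a list of booleans marking
    which samples belong to it. Groups are allowed to overlap.
    """
    errors = {}
    for name, members in groups.items():
        idx = [i for i, is_member in enumerate(members) if is_member]
        if not idx:
            errors[name] = None  # no samples from this group were observed
            continue
        mistakes = sum(y_true[i] != y_pred[i] for i in idx)
        errors[name] = mistakes / len(idx)
    return errors


# Toy data: six samples, one predictor, three overlapping groups.
y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 0, 0]
groups = {
    "all": [True] * 6,
    "a": [True, True, True, False, False, False],
    "b": [False, False, True, True, True, False],  # overlaps "a" on sample 2
}
errs = group_errors(y_true, y_pred, groups)
```

A predictor with a low error on `"all"` can still have a high error on `"b"`, which is the gap the multi-group learning model is meant to control.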

Paper Structure

This paper contains 23 sections, 13 theorems, 48 equations, 29 figures, 5 tables, 2 algorithms.

Key Result

Theorem 3.1

Let $\mathcal{H}$ be a hypothesis class and let $\mathcal{G}$ be a collection of hierarchically structured groups with leaf nodes $g_1, \dots, g_N$ partitioning the input space $\mathcal{X}$. Let $\ell(\cdot, \cdot) \in [0, 1]$ be any bounded loss function. Then, with probability $1 - \delta$ over ... for any $g \in \mathcal{G}$ such that $g = \bigcup_{i = 1}^k g_i$ and $\epsilon_g(n, \delta) := 9 \ldots$

Figures (29)

  • Figure 1: No best $h$ for all groups simultaneously. Letting $\mathcal{H}$ be the class of halfspaces, the groups $g_1$ (indicated by the green solid line) and $g_2$ (indicated by the red dotted line) overlap, but their optimal predictors $h_{g_1}$ and $h_{g_2}$ differ substantially.
  • Figure 2: Example of a hierarchically structured tree. Each level of the tree above corresponds to a demographic attribute ($\texttt{race}$, $\texttt{sex}$, and $\texttt{age}$). Proceeding down the tree yields increasingly granular subgroups. The leaves are the most granular level, with subgroups such as $\texttt{R6+} \land \texttt{male} \land \texttt{age} < 35$.
  • Figure 3: Test accuracy on race-sex-age groups for CA Employment (top row) and CA Income (bottom row). Each point in the plot represents the test error on a specific group. The $y = x$ line represents equal error between our MGL-Tree and the competing method; points above the $y = x$ line are groups where MGL-Tree exhibits better generalization.
  • Figure 4: Let $g_1$, the yellow node, be a group that Algorithm \ref{alg:tree} has already seen. Suppose $f$ updates on $g_2$. We see that $g_1$ is on the path from $g_2$ to the root. We need to show that the inequality for $g_1$ is not violated after the update.
  • Figure 5: Test accuracy of $\mathcal{H} =$ Logistic Regression for race-sex-age groups (CA Employment). Test errors across all $|\mathcal{G}| = 54$ hierarchically structured groups from race-sex-age. See Appendix \ref{sec:dataset_details} for more information on the specific categories for each group.
  • ...and 24 more figures
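
The tree in Figure 2 can be illustrated with a small sketch that builds groups level by level from demographic attributes and checks that any two groups are either nested or disjoint. The category values below are placeholders rather than the dataset's actual categories, and the nested-or-disjoint check is an assumed reading of "hierarchically structured", not a quotation of Definition 2.2.

```python
from itertools import product

# Hypothetical attribute levels, ordered from the root of the tree downward.
LEVELS = [
    ("race", ["R1", "R6+"]),
    ("sex", ["male", "female"]),
    ("age", ["<35", ">=35"]),
]


def build_groups(levels):
    """Return every group as a tuple of attribute values (a path from the root)."""
    groups = [()]  # the empty tuple is the root group containing everyone
    for depth in range(1, len(levels) + 1):
        value_lists = [values for _, values in levels[:depth]]
        groups.extend(product(*value_lists))
    return groups


def in_group(record, group, levels):
    """A record belongs to a group if it matches every value on the group's path."""
    return all(record[levels[i][0]] == value for i, value in enumerate(group))


def is_hierarchical(groups, records, levels):
    """Check that every pair of groups is nested or disjoint on the given records."""
    members = [
        frozenset(i for i, r in enumerate(records) if in_group(r, g, levels))
        for g in groups
    ]
    return all(
        a <= b or b <= a or a.isdisjoint(b) for a in members for b in members
    )


names = [name for name, _ in LEVELS]
records = [dict(zip(names, combo)) for combo in product(*(v for _, v in LEVELS))]
groups = build_groups(LEVELS)
leaves = [g for g in groups if len(g) == len(LEVELS)]
```

With these placeholder values the tree has 15 groups, of which 8 are leaves such as `("R6+", "male", "<35")`. The leaves partition the records, and every coarser group is a union of leaves.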

Theorems & Definitions (24)

  • Definition 2.1: Multi-group learning property
  • Definition 2.2: Hierarchically Structured Groups
  • Definition 2.3: Hierarchical Tree
  • Theorem 3.1
  • Theorem 3.2
  • Lemma 3.3
  • Theorem 3.4: Correctness of MGL-Tree
  • Lemma 1.1: Theorem 1 from tosh_simple_2022
  • Lemma 1.2
  • proof
  • ...and 14 more
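
As a rough illustration of how a tree-structured predictor over hierarchical groups might be assembled, the sketch below visits groups from the root downward and overrides the inherited predictor on a group only when a group-specific predictor is better by more than a threshold. This is not the paper's MGL-Tree algorithm: the update rule, the threshold `eps`, the constant-label hypothesis class, and all names are assumptions made for the example.

```python
def predict(assigned, path):
    """Use the predictor of the deepest group on `path` that has one assigned."""
    for depth in range(len(path), -1, -1):
        if path[:depth] in assigned:
            return assigned[path[:depth]]
    raise ValueError("the root group has no predictor assigned")


def fit_tree(leaf_paths, y, eps):
    """Assign constant-label predictors to groups, visiting shallow groups first.

    leaf_paths[i] is the tuple of attribute values (the leaf group) of sample i.
    """
    groups = {p[:d] for p in leaf_paths for d in range(len(p) + 1)}
    assigned = {}
    for g in sorted(groups, key=len):  # root first, then deeper levels
        idx = [i for i, p in enumerate(leaf_paths) if p[: len(g)] == g]
        # Empirical risk minimizer over the toy class of constant labels {0, 1}.
        best = min((0, 1), key=lambda c: sum(y[i] != c for i in idx))
        best_err = sum(y[i] != best for i in idx) / len(idx)
        if g == ():
            assigned[g] = best  # the root always receives a predictor
            continue
        inherited = predict(assigned, g[:-1])  # predictor currently used on g
        cur_err = sum(y[i] != inherited for i in idx) / len(idx)
        if cur_err - best_err > eps:  # assumed update rule
            assigned[g] = best
    return assigned


# Toy data: two levels of grouping, labels mostly 0 under "A" and 1 under "B".
leaf_paths = (
    [("A", "x")] * 3 + [("A", "y")] * 3 + [("B", "x")] * 3 + [("B", "y")] * 2
)
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
tree = fit_tree(leaf_paths, y, eps=0.1)
```

On this toy data only the root and the group `("A",)` receive their own predictor, so the result reads as a small decision tree: predict 0 inside `("A",)` and 1 everywhere else.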