A Characterization of List Regression

Chirag Pabbaraju; Sahasrajit Sarmasarkar

TL;DR

This work gives a complete PAC characterization of list regression: it identifies when a real-valued hypothesis class is learnable with a short list of $k$ predictions under the absolute loss. It introduces two dimensions, the $k$-OIG dimension for realizable list regression and the $k$-fat-shattering dimension for agnostic list regression, and proves that finiteness of each is necessary and sufficient for learnability in its setting, with upper and lower bounds that match up to polylogarithmic factors. The upper bounds come from a unified algorithmic framework: discretization to a partial multiclass class, weak list learners built via the one-inclusion graph, minimax boosting, and sample compression, which together yield generalization guarantees in both the realizable and agnostic settings. The lower bounds rest on higher-order packing and a stronger notion of fat-shattering; discretization is used to relate fat-shattering to strong-fat-shattering, and these dimensions are in turn linked to packing numbers. The work shows how list predictions extend regression theory beyond the standard $k=1$ setting and leaves open questions, such as whether ERM-based approaches can achieve the agnostic list-learning guarantees via a carefully constructed list hypothesis class and its flattening.

Abstract

There has been recent interest in understanding and characterizing the sample complexity of list learning tasks, where the learning algorithm is allowed to make a short list of $k$ predictions, and we simply require one of the predictions to be correct. This includes recent works characterizing the PAC sample complexity of standard list classification and online list classification. Adding to this theme, we provide a complete characterization of list PAC regression. We propose two combinatorial dimensions, namely the $k$-OIG dimension and the $k$-fat-shattering dimension, and show that they characterize realizable and agnostic $k$-list regression, respectively. These quantities generalize known dimensions for standard regression. Our work thus extends existing list learning characterizations from classification to regression.
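To make the list-prediction objective concrete, here is a minimal Python sketch of one natural reading of it: a $k$-list predictor is charged the absolute loss of its best entry. The names `list_absolute_loss`, `empirical_list_error`, and `two_list` are hypothetical, and the paper's formal definitions may differ in detail from this illustration.

```python
from typing import Callable, Sequence, Tuple


def list_absolute_loss(predictions: Sequence[float], y: float) -> float:
    """Absolute loss of the closest prediction in the list (assumed list loss)."""
    if not predictions:
        raise ValueError("the list must contain at least one prediction")
    return min(abs(p - y) for p in predictions)


def empirical_list_error(
    list_predictor: Callable[[float], Sequence[float]],
    sample: Sequence[Tuple[float, float]],
) -> float:
    """Average list loss of a list predictor over a labeled sample."""
    return sum(list_absolute_loss(list_predictor(x), y) for x, y in sample) / len(sample)


def two_list(x: float) -> Sequence[float]:
    """Hypothetical 2-list predictor that hedges between x and -x."""
    return [x, -x]


sample = [(1.0, 1.0), (2.0, -2.0), (3.0, 2.5)]
error = empirical_list_error(two_list, sample)
```

On this toy sample the first two labels are matched exactly by one of the two entries, so only the third example contributes to the error; a single-valued predictor restricted to either entry alone would do worse.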

Paper Structure (40 sections, 32 theorems, 152 equations, 4 algorithms)

Introduction
Overview of Results
Preliminaries and Notation
PAC Learnability
Sample Compression
Agnostic List Regression
Upper Bound for Agnostic List Regression
Construction of a Partial Hypothesis Class
Construction of a Weak Learner for the $k$-Threshold Class $\mathscr{T}_{\gamma,k}(\mathcal{H})$
Bounding the Training Error
Bounding the Generalization Error
Lower Bound for Agnostic List Regression
Stronger Notion of $k$-Fat-Shattering
Higher-Order Packing
Relating $k$-Ary Packing to $k$-Strong-Fat-Shattering
...and 25 more sections

Key Result

Theorem 1

A hypothesis class $\mathcal{H}$ is amenable to agnostic $k$-list regression if and only if its $k$-fat-shattering dimension is finite at all scales.
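For orientation, the classical $\gamma$-fat-shattering condition (the standard $k=1$ notion that the $k$-fat-shattering dimension generalizes) can be checked by brute force on a finite class. The sketch below covers only that classical condition with user-supplied witnesses; it does not implement the paper's $k$-list version, whose exact definition is not reproduced in this summary. The function name `is_fat_shattered` and the toy class are hypothetical.

```python
from itertools import product
from typing import Callable, Sequence


def is_fat_shattered(
    hypotheses: Sequence[Callable[[float], float]],
    points: Sequence[float],
    witnesses: Sequence[float],
    gamma: float,
) -> bool:
    """Classical gamma-fat-shattering check with fixed witnesses r_i.

    For every binary pattern b there must be some h in the class with
    h(x_i) >= r_i + gamma when b_i = 1 and h(x_i) <= r_i - gamma when b_i = 0.
    """
    for pattern in product([0, 1], repeat=len(points)):
        realized = any(
            all(
                (h(x) >= r + gamma) if b else (h(x) <= r - gamma)
                for x, r, b in zip(points, witnesses, pattern)
            )
            for h in hypotheses
        )
        if not realized:
            return False
    return True


# Toy class: all four {0, 1}-valued functions on the two points 0 and 1.
toy_class = [
    (lambda x, a=a, b=b: a if x == 0 else b)
    for a, b in product([0.0, 1.0], repeat=2)
]
shattered_at_half = is_fat_shattered(toy_class, [0, 1], [0.5, 0.5], 0.5)
shattered_at_larger = is_fat_shattered(toy_class, [0, 1], [0.5, 0.5], 0.6)
```

With witnesses at 0.5, the toy class shatters both points at scale 0.5 but not at scale 0.6, which illustrates why the theorem asks for finiteness "at all scales" rather than at a single $\gamma$.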

Theorems & Definitions (90)

Theorem 1: Informal, Agnostic List Regression, see Theorems \ref{thm:agnostic-upper-bound}, \ref{thm:agnostic-lower-bound-quantitative}
Theorem 2: Informal, Realizable List Regression, see Theorems \ref{thm:realizable-ub}, \ref{thm:realizable-lb}
Definition 1: Agnostic List Regression
Definition 2: $\gamma$-fat-shattering dimension [fatshatteringdimension]
Definition 3: Realizable List Regression
Definition 4: One-inclusion graph [haussler1994predicting, RUBINSTEIN200937]
Definition 5: Orientation and scaled outdegree [attias2023optimal]
Definition 6: OIG dimension [attias2023optimal]
Definition 7: List Sample Compression
Lemma 2.1: Generalization by compression, essentially Theorem 30.2 in [shalev2014understanding]
...and 80 more
