Efficient Discrepancy Testing for Learning with Distribution Shift

Gautam Chandrasekaran, Adam R. Klivans, Vasilis Kontonis, Konstantinos Stavropoulos, Arsen Vasilyan

TL;DR

The paper tackles learning under distribution shift by reframing the problem through discrepancy distance and introducing efficient, localized discrepancy testers that enable provable guarantees in the Testable Learning with Distribution Shift (TDS) framework. It advances the theory by replacing prior $L_2$-sandwiching requirements with $L_1$-sandwiching, yielding exponential gains for constant-depth circuits and first results for degree-2 polynomial threshold functions. The authors develop universal TDS learners for broad distribution families (including convex sets with low intrinsic dimension and intersections of halfspaces) and provide fully polynomial-time testing for key classes, including balanced intersections of halfspaces. A central thread is the Chow-matching tester, combined with subspace-structured testers (cylindrical grids) and boundary-proximity techniques, enabling end-to-end TDS learning with decoupled training and testing. The work also establishes NP-hardness for global discrepancy testing, underscoring the practical relevance of localized testing for tractable certifiable learning under distribution shift.

Abstract

A fundamental notion of distance between train and test distributions from the field of domain adaptation is discrepancy distance. While in general hard to compute, here we provide the first set of provably efficient algorithms for testing localized discrepancy distance, where discrepancy is computed with respect to a fixed output classifier. These results imply a broad set of new, efficient learning algorithms in the recently introduced model of Testable Learning with Distribution Shift (TDS learning) due to Klivans et al. (2023). Our approach generalizes and improves all prior work on TDS learning: (1) we obtain universal learners that succeed simultaneously for large classes of test distributions, (2) achieve near-optimal error rates, and (3) give exponential improvements for constant depth circuits. Our methods further extend to semi-parametric settings and imply the first positive results for low-dimensional convex sets. Additionally, we separate learning and testing phases and obtain algorithms that run in fully polynomial time at test time.
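
To make the idea of discrepancy "with respect to a fixed output classifier" concrete, here is a minimal sketch that estimates a one-sided localized discrepancy from samples. It rests on assumptions not taken from the paper: the concept class is replaced by a small finite list of candidate halfspaces, the quantity is the largest gap between test and training disagreement with the fixed classifier, and all names (`empirical_localized_discrepancy`, `halfspace`, `X_shifted`) are hypothetical. The paper's efficient testers avoid this kind of brute-force enumeration, and its formal definition may differ in detail.

```python
import numpy as np

def disagreement_rate(f, g, X):
    """Fraction of points in X on which classifiers f and g disagree."""
    return float(np.mean(f(X) != g(X)))

def empirical_localized_discrepancy(f_hat, candidates, X_train, X_test):
    """Largest (test disagreement - train disagreement) with the fixed
    classifier f_hat, over a finite, illustrative list of candidates."""
    return max(
        disagreement_rate(f, f_hat, X_test) - disagreement_rate(f, f_hat, X_train)
        for f in candidates
    )

def halfspace(w):
    """Homogeneous halfspace x -> sign(<w, x>) with labels in {-1, +1}."""
    w = np.asarray(w, dtype=float)
    return lambda X: np.where(X @ w >= 0, 1, -1)

rng = np.random.default_rng(0)
X_train = rng.standard_normal((2000, 2))       # training marginal: standard Gaussian
X_shifted = X_train + np.array([3.0, 0.0])     # hypothetical shifted test sample

f_hat = halfspace([1.0, 0.0])                  # the fixed output classifier
candidates = [halfspace([np.cos(t), np.sin(t)])
              for t in np.linspace(0.0, np.pi, 16)]

same = empirical_localized_discrepancy(f_hat, candidates, X_train, X_train)
shifted = empirical_localized_discrepancy(f_hat, candidates, X_train, X_shifted)
print(f"no shift: {same:.3f}, mean shift: {shifted:.3f}")
```

When the test sample equals the training sample, every gap is zero, so the estimate is zero. Under the mean shift, some candidate disagrees with `f_hat` far more often on the test sample than on the training sample, and the estimate becomes clearly positive.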

Paper Structure

This paper contains 61 sections, 47 theorems, 59 equations, 3 figures, 3 tables, and 3 algorithms.

Key Result

Theorem 3.1

Let $\epsilon,\delta\in (0,1)$ and let $\mathcal{C}\subseteq\{\mathcal{X}\to \{\pm 1\}\}$ be a concept class such that the $\epsilon$-approximate $\mathcal{L}_1$-sandwiching degree of $\mathcal{C}$ under $\mathcal{D}$ is $\ell(\epsilon) \in\mathbb{N}$. Then, there exists a TDS learning algorithm
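
For context, the standard notion of $\mathcal{L}_1$-sandwiching can be written as follows. This is a sketch of the usual definition rather than a quotation from the paper, and the names $p_{\mathrm{down}}$ and $p_{\mathrm{up}}$ are notation assumed here. A function $f\in\mathcal{C}$ is $\epsilon$-approximately $\mathcal{L}_1$-sandwiched under $\mathcal{D}$ by polynomials $p_{\mathrm{down}}, p_{\mathrm{up}}$ if

$$
p_{\mathrm{down}}(\mathbf{x}) \le f(\mathbf{x}) \le p_{\mathrm{up}}(\mathbf{x}) \quad \text{for all } \mathbf{x}\in\mathcal{X},
\qquad
\mathbb{E}_{\mathbf{x}\sim\mathcal{D}}\big[\,p_{\mathrm{up}}(\mathbf{x}) - p_{\mathrm{down}}(\mathbf{x})\,\big] \le \epsilon .
$$

Under this reading, the sandwiching degree $\ell(\epsilon)$ is the smallest degree for which such a pair of polynomials exists for every $f\in\mathcal{C}$. The $\mathcal{L}_2$ variant referred to in the TL;DR instead bounds $\mathbb{E}_{\mathbf{x}\sim\mathcal{D}}\big[(p_{\mathrm{up}}(\mathbf{x}) - p_{\mathrm{down}}(\mathbf{x}))^2\big]$, which is a stronger requirement on the sandwiching pair.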

Figures (3)

  • Figure 1: If $\mathbf{x}$ lies within a balanced convex set $\mathcal{K}$, then many points close to $\mathbf{x}$ lie within $\mathcal{K}$ as well, i.e., there is a cone $\mathcal{R}'$ with $\mathcal{R}'\subseteq\mathbb{B}(\mathbf{x},\varrho) \cap \mathcal{K}$, where $\mathbb{B}(\mathbf{x},\varrho)$ is a ball around $\mathbf{x}$. The ball centered at $\mathbf{x}_c$ exists due to the fact that $\mathcal{K}$ is balanced: any balanced convex set contains some ball with non-negligible radius. The convex hull of $\mathbf{x}$ and the ball at $\mathbf{x}_c$ lies within $\mathcal{K}$. (See also Fig. \ref{figure:localization-of-convex-disagreement})
  • Figure 2: Discretization of smooth boundary
  • Figure 3: If $\mathbf{x}\in \mathcal{K}$, then there is a cone $\mathcal{R}'\subseteq\mathbb{B}_k(\mathbf{x},\varrho) \cap \mathcal{K}$

Theorems & Definitions (111)

  • Definition 1.1: Localized Discrepancy
  • Definition 2.1: Testing Localized Discrepancy
  • Definition 2.2: Universal TDS Learning
  • Theorem 3.1: $\mathcal{L}_1$-sandwiching implies TDS learning
  • Proposition 3.2: Informal, see \ref{thm:chow_matching}
  • Theorem 3.3: TDS Learning of Convex Subspace Juntas
  • Theorem 3.4: Universal TDS Learning of Convex Subspace Juntas
  • Theorem 3.5: Universal TDS Learning of Balanced Intersections
  • Theorem 3.6: TDS Learning of Balanced Intersections
  • Remark 3.7
  • ...and 101 more