Efficient Discrepancy Testing for Learning with Distribution Shift
Gautam Chandrasekaran, Adam R. Klivans, Vasilis Kontonis, Konstantinos Stavropoulos, Arsen Vasilyan
TL;DR
The paper studies learning under distribution shift through the lens of discrepancy distance and introduces efficient testers for localized discrepancy, which yield provable guarantees in the Testable Learning with Distribution Shift (TDS) framework. It shows that $L_1$-sandwiching suffices where prior work required $L_2$-sandwiching, giving exponential improvements for constant-depth circuits and the first results for degree-2 polynomial threshold functions. The authors develop universal TDS learners that succeed simultaneously for broad families of test distributions, with results covering intersections of halfspaces and convex sets with low intrinsic dimension, and they give algorithms that run in fully polynomial time at test time for key classes, including balanced intersections of halfspaces. A central thread is the Chow-matching tester, combined with subspace-structured testers (cylindrical grids) and boundary-proximity techniques, which together enable end-to-end TDS learning with decoupled training and testing phases. The work also establishes NP-hardness for global discrepancy testing, underscoring the importance of localized testing for tractable, certifiable learning under distribution shift.
Abstract
A fundamental notion of distance between train and test distributions from the field of domain adaptation is discrepancy distance. While discrepancy distance is in general hard to compute, we provide the first set of provably efficient algorithms for testing localized discrepancy distance, where discrepancy is computed with respect to a fixed output classifier. These results imply a broad set of new, efficient learning algorithms in the recently introduced model of Testable Learning with Distribution Shift (TDS learning) due to Klivans et al. (2023). Our approach generalizes and improves all prior work on TDS learning: we (1) obtain universal learners that succeed simultaneously for large classes of test distributions, (2) achieve near-optimal error rates, and (3) give exponential improvements for constant-depth circuits. Our methods further extend to semi-parametric settings and imply the first positive results for low-dimensional convex sets. Additionally, we separate the learning and testing phases and obtain algorithms that run in fully polynomial time at test time.
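To make the notion of localized discrepancy concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It assumes a finite list of candidate classifiers and computes, by brute force, the largest gap between how often a candidate disagrees with a fixed output classifier on test samples versus on training samples. The function name `localized_discrepancy`, the finite candidate list, and the use of an absolute difference are all assumptions made for illustration; the testers described in the paper are designed precisely to avoid this kind of enumeration.

```python
import numpy as np

def localized_discrepancy(f_hat, candidates, X_train, X_test):
    """Empirical localized discrepancy of a fixed classifier f_hat.

    Hypothetical helper: enumerates a finite list of candidate
    classifiers and returns the largest absolute gap between their
    disagreement rate with f_hat on test data and on training data.
    """
    pred_train = f_hat(X_train)
    pred_test = f_hat(X_test)
    worst = 0.0
    for f in candidates:
        # Fraction of points where the candidate disagrees with f_hat.
        dis_train = np.mean(f(X_train) != pred_train)
        dis_test = np.mean(f(X_test) != pred_test)
        worst = max(worst, abs(dis_test - dis_train))
    return float(worst)
```

A small value suggests that any candidate agreeing with the output classifier on the training distribution also roughly agrees with it on the test distribution; a large value signals that the shift may matter for this particular classifier.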
