PAC Learning is just Bipartite Matching (Sort of)
Shaddin Dughmi
TL;DR
The paper argues that PAC learning can be understood through bipartite matching by employing a transductive learning model and one-inclusion graphs, recasting multiclass and other loss settings as matching problems. It develops a precise graph-theoretic characterization of optimal transductive learning via Hall complexity and extends the framework to general loss functions using Functional Dependency Structures (FDS), establishing a compactness principle that ties sample complexity to finite projections. It also outlines algorithmic templates derived from the matching viewpoint, including local regularization and unsupervised pre-training, offering a practical pathway for near-optimal multiclass learning. The work highlights equivalences and gaps between transductive and PAC models, proposes local computation avenues, and suggests future directions for unifying combinatorial optimization with learning theory through Hall-type and matching-based analyses.
Abstract
The main goal of this article is to convince you, the reader, that supervised learning in the Probably Approximately Correct (PAC) model is closely related to -- of all things -- bipartite matching! En route from PAC learning to bipartite matching, I will give an overview of a particular transductive model of learning and its associated one-inclusion graphs, which can be viewed as a generalization of some of the hat puzzles that are popular in recreational mathematics. Although this transductive model is far from new, it has recently seen a resurgence of interest as a tool for tackling deep questions in learning theory. A secondary purpose of this article is to serve as a (biased) tutorial on the connections between the PAC and transductive models of learning.
