Leveraging PAC-Bayes Theory and Gibbs Distributions for Generalization Bounds with Complexity Measures
Paul Viallard, Rémi Emonet, Amaury Habrard, Emilie Morvant, Valentina Zantedeschi
TL;DR
This work tackles the challenge of obtaining computable generalization bounds that can incorporate arbitrary complexity measures. It introduces a disintegrated PAC-Bayes framework in which a user-defined parametric function $\mu(h,\mathcal{S})$ defines a Gibbs posterior $\rho_{\mathcal{S}}(h) \propto \exp[-\mu(h,\mathcal{S})]$, yielding bounds on the generalization gap $\phi(R^{\ell}_{\mathcal{D}}(h), R^{\ell}_{\mathcal{S}}(h))$ that adapt to task- and model-specific complexity. The paper provides a general theorem and two practical corollaries, one for uniform priors and one for informed priors, along with experiments on MNIST and FashionMNIST showing that learned complexity measures (including Gap and neural predictors) can yield tight bounds even without data-dependent priors. The framework offers a principled way to integrate diverse data- and model-dependent notions of complexity into generalization analysis and model selection for deep learning.
Abstract
In statistical learning theory, a generalization bound usually involves a complexity measure imposed by the considered theoretical framework. This limits the scope of such bounds, as other forms of capacity measures or regularizations are used in algorithms. In this paper, we leverage the framework of disintegrated PAC-Bayes bounds to derive a general generalization bound that can be instantiated with arbitrary complexity measures. A key step in proving this result is to consider a commonly used family of distributions: the Gibbs distributions. Our bound holds in probability jointly over the hypothesis and the learning sample, which allows the complexity measure to be adapted to the generalization gap, since it can be customized to fit both the hypothesis class and the task.
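
To make the Gibbs construction concrete, the following minimal sketch builds a distribution $\rho_{\mathcal{S}}(h) \propto \exp[-\mu(h,\mathcal{S})]$ over a small finite set of hypotheses and draws one hypothesis from it. It is a toy illustration under stated assumptions, not the authors' procedure: the finite hypothesis set, the values in `mu_values`, and the helper names `gibbs_posterior` and `sample_hypothesis` are all hypothetical, and the paper's setting is not restricted to finite hypothesis sets.

```python
import math
import random


def gibbs_posterior(mu_values):
    """Return probabilities proportional to exp(-mu) over a finite hypothesis set."""
    # Shift by the minimum before exponentiating for numerical stability;
    # the shift cancels in the normalization.
    m = min(mu_values)
    weights = [math.exp(-(mu - m)) for mu in mu_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_hypothesis(mu_values, rng):
    """Draw the index of one hypothesis h from the Gibbs distribution."""
    probs = gibbs_posterior(mu_values)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical complexity values mu(h, S) for four candidate hypotheses
# (assumed for illustration only).
mu_values = [0.5, 1.0, 3.0, 10.0]
probs = gibbs_posterior(mu_values)

rng = random.Random(0)
h = sample_hypothesis(mu_values, rng)
```

Hypotheses with a lower value of $\mu(h,\mathcal{S})$ receive more probability mass, and the ratio of any two probabilities depends only on the difference of their complexity values. This is what lets a user-chosen complexity measure shape the distribution from which the hypothesis is drawn.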
