A Framework for Bounding Deterministic Risk with PAC-Bayes: Applications to Majority Votes
Benjamin Leblanc, Pascal Germain
TL;DR
Classical PAC-Bayes bounds only control the stochastic risk, that is, the expected risk of a hypothesis drawn from a distribution $Q$ over hypotheses. This work introduces Stochastic-to-Deterministic (S2D) bounds, which instead yield risk guarantees for a single deterministic predictor $h$. The core idea is to bound the deterministic risk ${\rm L}_{\nabla}(h)$ through a general oracle relation involving the stochastic risk ${\rm E}_{h'\sim Q}{\rm L}_{\nabla}(h')$ and the conditional terms $b_{\nabla}^Q(h)$ and $c_{\nabla}^Q(h)$, and then to specialize this framework to majority votes under Categorical, Dirichlet, and Gaussian posteriors. The partition bound makes the guarantee computable: it solves a partition problem to tighten the lower bound on $c_{\nabla}^Q(h)$. Empirical results on binary and multi-class tasks show that the proposed bound is at least as tight as, and often significantly tighter than, existing baselines. The framework is general and consistent with existing PAC-Bayes methods. It paves the way for extending deterministic guarantees to more complex models such as neural networks, and for exploring differentiable approximations to the partition problem. Overall, S2D offers a practical route to deterministic generalization guarantees derived from PAC-Bayes principles.
Abstract
PAC-Bayes is a popular and efficient framework for obtaining generalization guarantees in situations involving uncountable hypothesis spaces. Unfortunately, in its classical formulation, it only provides guarantees on the expected risk of a randomly sampled hypothesis. This requires stochastic predictions at test time, making PAC-Bayes unusable in many practical situations where a single deterministic hypothesis must be deployed. We propose a unified framework to extract guarantees holding for a single hypothesis from stochastic PAC-Bayesian guarantees. We present a general oracle bound and derive from it a numerical bound and a specialization to majority vote. We empirically show that our approach consistently outperforms popular baselines (by up to a factor of 2) when it comes to generalization bounds on deterministic classifiers.
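To make the distinction between stochastic and deterministic risk concrete, the following minimal sketch compares the two on a toy binary problem. It is an illustration only and not the paper's method: the three voters, their predictions, the labels, and the function names `gibbs_risk` and `majority_vote_risk` are all hypothetical, and the posterior $Q$ is assumed to be a uniform Categorical distribution over the voters.

```python
from typing import Sequence

# Hypothetical toy data: labels in {-1, +1} and the predictions of three voters.
labels = [1, 1, 1, -1, -1, -1]
voter_predictions = [
    [1, 1, -1, -1, -1, 1],   # voter 0 errs on examples 2 and 5
    [1, -1, 1, -1, 1, -1],   # voter 1 errs on examples 1 and 4
    [-1, 1, 1, 1, -1, -1],   # voter 2 errs on examples 0 and 3
]
# Assumed posterior Q: uniform Categorical distribution over the voters.
posterior = [1 / 3, 1 / 3, 1 / 3]


def gibbs_risk(preds: Sequence[Sequence[int]], q: Sequence[float],
               ys: Sequence[int]) -> float:
    """Stochastic risk: expected 0-1 loss of a voter sampled from Q."""
    n = len(ys)
    return sum(
        weight * sum(p != y for p, y in zip(voter, ys)) / n
        for voter, weight in zip(preds, q)
    )


def majority_vote_risk(preds: Sequence[Sequence[int]], q: Sequence[float],
                       ys: Sequence[int]) -> float:
    """Deterministic risk: 0-1 loss of the single Q-weighted majority vote."""
    errors = 0
    for j, y in enumerate(ys):
        margin = sum(weight * voter[j] for voter, weight in zip(preds, q))
        vote = 1 if margin >= 0 else -1  # ties are broken towards +1
        errors += vote != y
    return errors / len(ys)


print(f"stochastic (Gibbs) risk:            {gibbs_risk(voter_predictions, posterior, labels):.3f}")
print(f"deterministic (majority vote) risk: {majority_vote_risk(voter_predictions, posterior, labels):.3f}")
```

On this toy data each voter errs on a third of the examples, so the stochastic risk is 1/3, while the weighted majority vote classifies every example correctly. A guarantee on the first quantity therefore says little, by itself, about the predictor that is actually deployed, which is the gap the abstract describes.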
