Misclassification bounds for PAC-Bayesian sparse deep learning
The Tien Mai
TL;DR
The paper develops a PAC-Bayesian analysis of sparse deep classifiers with Spike-and-Slab priors and a hinge-loss-based risk, yielding non-asymptotic misclassification bounds for an exponentially weighted aggregate (EWA), i.e., a Gibbs posterior. It proves slow- and fast-rate oracle inequalities and shows that, in explicit architectural regimes, the rates are minimax-optimal up to logarithmic factors in both low- and high-dimensional settings. It also proposes an automatic architecture selection procedure that balances expected hinge risk against posterior–prior complexity, which makes the method adaptive and guarantees near-optimal rates. The results bridge Bayesian DNN theory with classical minimax theory, providing practical generalization guarantees and principled architecture selection for sparse networks.
Abstract
Recently, there has been significant interest in the theoretical aspects of deep learning, especially its performance in classification tasks. Bayesian deep learning has emerged as a unified probabilistic framework that seeks to integrate deep learning with Bayesian methodology. However, a gap remains in the theoretical understanding of Bayesian approaches to deep learning for classification. This study is an attempt to bridge that gap. Using PAC-Bayes bound techniques, we present theoretical results on the prediction (misclassification) error of a probabilistic approach that uses Spike-and-Slab priors for sparse deep learning in classification. We establish non-asymptotic results for the prediction error. We also show that, by considering different architectures, our results achieve minimax-optimal rates in both low- and high-dimensional settings, up to a logarithmic factor. Moreover, our additional logarithmic term yields slight improvements over previous works. Finally, we propose and analyze an automated model selection approach for choosing a network architecture with guaranteed optimality.
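To make the Gibbs (EWA) posterior and the hinge risk mentioned in the TL;DR concrete, the following is a minimal sketch, not the paper's procedure. It assumes a finite set of hypothetical candidate linear classifiers with a uniform prior in place of a Spike-and-Slab prior over network weights, and an arbitrarily chosen temperature `lam`; all names, data, and values are illustrative.

```python
import numpy as np

def hinge_risk(scores, labels):
    """Empirical hinge risk: mean of max(0, 1 - y * f(x)), labels in {-1, +1}."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    return float(np.mean(np.maximum(0.0, 1.0 - labels * scores)))

def gibbs_weights(risks, log_prior, lam):
    """Weights proportional to prior * exp(-lam * risk), computed in log space."""
    log_w = np.asarray(log_prior, dtype=float) - lam * np.asarray(risks, dtype=float)
    log_w = log_w - log_w.max()  # subtract the max for numerical stability
    w = np.exp(log_w)
    return w / w.sum()

# Toy data (assumption): the label depends only on the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.where(X[:, 0] >= 0, 1.0, -1.0)

# Hypothetical sparse candidates standing in for draws from a sparse prior.
candidates = np.array([
    [2.0, 0.0, 0.0, 0.0, 0.0],  # uses the informative feature
    [0.0, 2.0, 0.0, 0.0, 0.0],  # uses an uninformative feature
    [0.0, 0.0, 0.0, 0.0, 0.0],  # all-zero classifier, hinge risk exactly 1
])

risks = np.array([hinge_risk(X @ w, y) for w in candidates])
log_prior = np.log(np.full(len(candidates), 1.0 / len(candidates)))  # uniform prior
weights = gibbs_weights(risks, log_prior, lam=10.0)  # lam is an illustrative choice

# Aggregated prediction: sign of the weight-averaged scores.
agg_scores = (X @ candidates.T) @ weights
agg_pred = np.where(agg_scores >= 0, 1.0, -1.0)
```

With a uniform prior, the weights are driven entirely by the empirical hinge risk, so the candidate with the smallest risk receives the largest weight. A non-uniform `log_prior` would shift mass toward candidates the prior favors, which is the risk-versus-complexity trade-off in miniature.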
