Misclassification bounds for PAC-Bayesian sparse deep learning

The Tien Mai

TL;DR

The paper develops a PAC-Bayesian analysis of sparse deep classifiers that combines Spike-and-Slab priors with a hinge-loss risk, yielding non-asymptotic misclassification bounds for an exponentially weighted aggregate (EWA), i.e. a Gibbs posterior. It proves slow- and fast-rate oracle inequalities and shows that, under explicit architectural regimes, the resulting rates are minimax-optimal up to logarithmic factors in both low- and high-dimensional settings. It also proposes an automatic architecture selection procedure that balances expected hinge risk against posterior–prior complexity, adapting to the unknown best architecture while retaining near-optimal rates. The results connect Bayesian DNN theory with classical minimax theory and give generalization guarantees and a principled architecture-selection rule for sparse networks.

Abstract

Recently, there has been significant interest in the theoretical aspects of deep learning, especially its performance in classification tasks. Bayesian deep learning has emerged as a unified probabilistic framework that seeks to integrate deep learning with Bayesian methodology. However, there is a gap in the theoretical understanding of Bayesian approaches to deep learning for classification. This study is an attempt to bridge that gap. Using PAC-Bayes bound techniques, we present theoretical results on the prediction (misclassification) error of a probabilistic approach that uses Spike-and-Slab priors for sparse deep learning in classification. We establish non-asymptotic results for the prediction error. We further demonstrate that, by considering different architectures, our results achieve minimax-optimal rates in both low- and high-dimensional settings, up to a logarithmic factor. Moreover, the additional logarithmic term in our rates slightly improves on previous works. Finally, we propose and analyze an automated model selection approach that chooses a network architecture with guaranteed optimality.
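
To make the Gibbs (EWA) posterior from the TL;DR concrete, here is a minimal sketch under strong simplifying assumptions that are not taken from the paper: the candidates are a small finite set of linear classifiers rather than sparse deep networks, the prior is uniform rather than Spike-and-Slab, and the names `hinge_risk`, `gibbs_weights`, and the inverse temperature `lam` are hypothetical. It only illustrates the general recipe of weighting each candidate by its prior mass times the exponential of minus `lam` times its empirical hinge risk.

```python
import numpy as np

def hinge_risk(theta, X, y):
    """Empirical hinge risk of the linear score x -> x @ theta, labels in {-1, +1}."""
    margins = y * (X @ theta)
    return np.mean(np.maximum(0.0, 1.0 - margins))

def gibbs_weights(candidates, X, y, prior, lam):
    """Weights proportional to prior(theta) * exp(-lam * hinge_risk(theta))."""
    risks = np.array([hinge_risk(t, X, y) for t in candidates])
    log_w = np.log(prior) - lam * risks
    log_w -= log_w.max()  # subtract the max to keep the exponentials stable
    w = np.exp(log_w)
    return w / w.sum(), risks

# Toy data generated from a sparse linear rule (an illustrative assumption).
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
theta_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0])
y = np.sign(X @ theta_true)
y[y == 0] = 1.0

# Finite candidate set with a uniform prior (both are simplifications).
candidates = [theta_true] + [rng.normal(size=d) for _ in range(9)]
prior = np.full(len(candidates), 1.0 / len(candidates))
weights, risks = gibbs_weights(candidates, X, y, prior, lam=float(n))
```

With a uniform prior the largest weight always goes to the candidate with the smallest empirical hinge risk; a non-uniform, sparsity-favouring prior would shift mass toward simpler candidates, which is the trade-off between risk and complexity that the abstract refers to.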
