Misclassification excess risk bounds for PAC-Bayesian classification via convexified loss

The Tien Mai

TL;DR

The paper addresses the problem of bounding misclassification excess risk for PAC-Bayesian classification when using a convex surrogate loss. It develops a PAC-Bayes relative bound in expectation under a low-noise/margin condition, showing that with the Gibbs posterior and a calibrated temperature, the misclassification risk can be controlled by a combination of the convex-loss excess and a KL-divergence penalty, yielding fast n^{-1} rates. The framework is illustrated through two nontrivial applications: high-dimensional sparse classification with a sparsity-promoting prior achieving s^* log(d/s^*)/n bounds, and 1-bit matrix completion with a low-rank prior achieving r(d_1+d_2)/n bounds (up to log factors), both minimax-optimal. These results extend PAC-Bayesian theory beyond risk bounds for convexified losses to direct misclassification risk, providing practical guarantees for convex PAC-Bayesian methods in classification tasks with large or structured parameter spaces.
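
The Gibbs posterior mentioned above can be illustrated with a small sketch. This is not the paper's implementation. It assumes the hinge loss as the convex surrogate, a finite set of hypothetical candidate linear classifiers with a uniform prior, synthetic noiseless data, and a temperature of the form λ = n/C with the hypothetical choice C = 1. All function and variable names are invented for illustration.

```python
import math
import random

def hinge_risk(theta, data):
    """Empirical hinge risk of the linear classifier x -> sign(<theta, x>)."""
    total = 0.0
    for x, y in data:
        score = sum(t * xi for t, xi in zip(theta, x))
        total += max(0.0, 1.0 - y * score)
    return total / len(data)

def zero_one_risk(theta, data):
    """Empirical misclassification rate of the same classifier."""
    errors = 0
    for x, y in data:
        score = sum(t * xi for t, xi in zip(theta, x))
        pred = 1 if score >= 0 else -1
        errors += int(pred != y)
    return errors / len(data)

def gibbs_weights(candidates, prior, data, lam):
    """Gibbs posterior on a finite set: weight proportional to prior * exp(-lam * hinge risk)."""
    log_w = [math.log(p) - lam * hinge_risk(th, data)
             for th, p in zip(candidates, prior)]
    m = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - m) for lw in log_w]
    norm = sum(w)
    return [wi / norm for wi in w]

def kl_divergence(post, prior):
    """KL(post || prior) for two distributions on the same finite set."""
    return sum(q * math.log(q / p) for q, p in zip(post, prior) if q > 0)

random.seed(0)
# Synthetic data (assumption): the label is the sign of the first coordinate.
data = []
for _ in range(200):
    x = (random.uniform(-1, 1), random.uniform(-1, 1))
    data.append((x, 1 if x[0] >= 0 else -1))

# Hypothetical candidate parameters and a uniform prior over them.
candidates = [(5.0, 0.0), (0.0, 5.0), (-5.0, 0.0), (3.0, 3.0)]
prior = [0.25] * len(candidates)

lam = len(data) / 1.0  # lambda = n / C with the hypothetical choice C = 1
post = gibbs_weights(candidates, prior, data, lam)

# Misclassification risk averaged under the Gibbs posterior, and the KL penalty.
gibbs_error = sum(w * zero_one_risk(th, data) for w, th in zip(post, candidates))
penalty = kl_divergence(post, prior)
print(post, gibbs_error, penalty)
```

The posterior is fitted using only the hinge risk, while `gibbs_error` reports the 0-1 risk averaged under that posterior. Relating these two quantities is the kind of question the paper's bounds address. On a finite candidate set the KL penalty is at most the logarithm of the number of candidates.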

Abstract

PAC-Bayesian bounds have proven to be a valuable tool for deriving generalization bounds and for designing new learning algorithms in machine learning. However, they typically focus on providing generalization bounds with respect to a chosen loss function. In classification tasks, due to the non-convex nature of the 0-1 loss, a convex surrogate loss is often used, and thus current PAC-Bayesian bounds are primarily specified for this convex surrogate. This work shifts the focus to providing misclassification excess risk bounds for PAC-Bayesian classification when using a convex surrogate loss. The key ingredient is to leverage PAC-Bayesian relative bounds in expectation rather than relying on PAC-Bayesian bounds in probability. We demonstrate our approach in several important applications.

Paper Structure (13 sections, 7 theorems, 54 equations)

Introduction and motivation
Main result
PAC-Bayesian framework
Main result
Assumptions
Main results
Application
High-dimensional sparse classification
1-bit matrix completion
Concluding discussions
Proofs
Proof of Section \ref{sc_problem_method}
Proof of Section \ref{sc_application}

Key Result

Theorem 1

Suppose that Assumptions \ref{assume_boundedloss}, \ref{assume_Lipschitz} and \ref{dfnbernstein} are satisfied, and take $\lambda = n/\overline{C}$. Then we have:

Theorems & Definitions (22)

Remark 1
Theorem 1
Remark 2
Theorem 2
Remark 3
Corollary 1
Remark 4
Example 1: Finite case
Example 2
Theorem 3
...and 12 more
