Bound to Disagree: Generalization Bounds via Certifiable Surrogates

Mathieu Bazinet, Valentina Zantedeschi, Pascal Germain

TL;DR

This paper provides new disagreement-based certificates for the gap between the true risks of any two predictors. It bounds the true risk of the predictor of interest via a surrogate model that enjoys tight generalization guarantees, with the disagreement bound evaluated on an unlabeled dataset.

Abstract

Generalization bounds for deep learning models are typically vacuous, not computable, or restricted to specific model classes. In this paper, we tackle these issues by providing new disagreement-based certificates for the gap between the true risks of any two predictors. We then bound the true risk of the predictor of interest via a surrogate model that enjoys tight generalization guarantees, and by evaluating our disagreement bound on an unlabeled dataset. We empirically demonstrate the tightness of the obtained certificates and showcase the versatility of the approach by training surrogate models leveraging three different frameworks: sample compression, model compression and PAC-Bayes theory. Importantly, such guarantees are achieved without modifying the target model or adapting the training procedure to the generalization framework.
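
The surrogate idea can be illustrated with a minimal sketch for the zero-one loss. It is not the paper's theorem: it relies only on the standard fact that the risk of the target is at most the risk of the surrogate plus the probability that the two predictors disagree, and it bounds that probability with a one-sided Hoeffding inequality. The function names are hypothetical, the unlabeled points are assumed to be i.i.d. and independent of both predictors, and `surrogate_bound` is assumed to be a certificate that already holds with probability at least 1 - delta/2.

```python
import math


def disagreement_rate(preds_target, preds_surrogate):
    """Fraction of unlabeled points on which the two predictors output different labels."""
    if len(preds_target) != len(preds_surrogate) or len(preds_target) == 0:
        raise ValueError("prediction lists must be non-empty and of equal length")
    diff = sum(a != b for a, b in zip(preds_target, preds_surrogate))
    return diff / len(preds_target)


def hoeffding_upper(empirical_rate, m, delta):
    """One-sided Hoeffding upper bound on a [0, 1]-valued mean from m i.i.d. samples."""
    return min(1.0, empirical_rate + math.sqrt(math.log(1.0 / delta) / (2.0 * m)))


def target_risk_bound(surrogate_bound, preds_target, preds_surrogate, delta):
    """Upper bound on the zero-one risk of the target predictor.

    Assumption: surrogate_bound holds with probability >= 1 - delta/2, so a
    union bound with the disagreement certificate gives overall confidence 1 - delta.
    """
    m = len(preds_target)
    d_hat = disagreement_rate(preds_target, preds_surrogate)
    d_up = hoeffding_upper(d_hat, m, delta / 2.0)
    # Triangle inequality for the zero-one loss: R(target) <= R(surrogate) + P(disagree).
    return min(1.0, surrogate_bound + d_up)
```

With a surrogate certificate of 0.10 and 10 disagreements on 1000 unlabeled points, `target_risk_bound(0.10, [0] * 1000, [0] * 990 + [1] * 10, 0.05)` adds the Hoeffding-corrected disagreement rate to the surrogate certificate. No labels are needed for the second term, which is why it can be evaluated on an unlabeled dataset.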

Paper Structure

This paper contains 35 sections, 27 theorems, 68 equations, 6 figures, and 10 tables.

Key Result

Theorem 1

For a distribution $\mathcal{D}$ over $\mathcal{X} \times \mathcal{Y}$, a loss $\ell: \mathbb{R}^C \times \mathcal{Y} \to [B_{\ell}, T_{\ell}]$ and $\delta \in (0,1]$ ... with $P_n(|\mathbf{i}|) = \tfrac{6}{\pi^2}(|\mathbf{i}|$ ...

Figures (6)

  • Figure 1: Generalization bounds. Comparison between bounds from the literature (Norm-based and Partition-based) and our new disagreement-based bounds, using surrogates from sample compression (SC), model compression (MC) and PAC-Bayes (PB) theory.
  • Figure 2: Illustration of the behavior of our disagreement bounds on MNIST using sample-compression methods.
  • Figure 3: Behavior of the disagreement bound according to the number of bits and the use of quantization-aware training.
  • Figure S1: Comparison of the different approaches to compute the partition-based bounds. We consider 5 different seeds for the random clusters.
  • Figure S2: Illustration of the behavior of our disagreement bounds on CIFAR10 using sample-compression methods.
  • ...and 1 more figure

Theorems & Definitions (41)

  • Theorem 1: bazinet2024
  • Theorem 2: lotfi2023non
  • Theorem 3: mcallester2003pac
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Definition S1: perez2021tighter
  • Definition S2: lotfi2023non
  • Definition S3
  • Theorem S4: Partition-based bound of than2025non
  • ...and 31 more