Block-Sample MAC-Bayes Generalization Bounds

Matthias Frey, Jingge Zhu, Michael C. Gastpar

TL;DR

We present a family of novel block-sample MAC-Bayes bounds that hold the promise of being significantly tighter than traditional PAC-Bayes and MAC-Bayes bounds.

Abstract

We present a family of novel block-sample MAC-Bayes (mean approximately correct) bounds. While PAC-Bayes (probably approximately correct) bounds typically bound the generalization error with high probability, MAC-Bayes bounds have a similar form but bound the expected generalization error instead. The family of bounds we propose can be understood as a generalization of an expectation version of known PAC-Bayes bounds. Compared to standard PAC-Bayes bounds, the new bounds contain divergence terms that depend only on subsets (or blocks) of the training data. The proposed MAC-Bayes bounds hold the promise of significantly improving upon the tightness of traditional PAC-Bayes and MAC-Bayes bounds. This is illustrated with a simple numerical example in which the original PAC-Bayes bound is vacuous regardless of the choice of prior, while the proposed family of bounds is finite for appropriate choices of the block size. We also explore the question of whether high-probability versions of our MAC-Bayes bounds (i.e., PAC-Bayes bounds of a similar form) are possible. We answer this question in the negative with an example showing that, in general, it is not possible to establish a PAC-Bayes bound which (a) vanishes with a rate faster than $\mathcal{O}(1/\log n)$ whenever the proposed MAC-Bayes bound vanishes with rate $\mathcal{O}(n^{-1/2})$ and (b) exhibits a logarithmic dependence on the permitted error probability.
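
For orientation, the difference between the two bound types can be sketched with classical single-prior bounds. These are standard textbook forms, not the bounds proposed in this paper, and the notation is an assumption made here for illustration: $L(W)$ is the population loss, $\hat{L}(S,W)$ the empirical loss on $S = (Z_1, \ldots, Z_n)$, the loss takes values in $[0,1]$, and the prior $Q_W$ is chosen independently of $S$. A McAllester-type PAC-Bayes bound (in Maurer's form, for $n \geq 8$) states that, with probability at least $1-\delta$ over $S$ and simultaneously for all posteriors $P_{W|S}$,

$$
\mathbb{E}_{P_{W|S}}\big[L(W) - \hat{L}(S,W)\big] \;\le\; \sqrt{\frac{D_{\mathrm{KL}}(P_{W|S} \,\|\, Q_W) + \log\!\big(2\sqrt{n}/\delta\big)}{2n}},
$$

whereas a MAC-Bayes bound of the corresponding form controls only the mean over the sample:

$$
\mathbb{E}_{P_S}\,\mathbb{E}_{P_{W|S}}\big[L(W) - \hat{L}(S,W)\big] \;\le\; \sqrt{\frac{\mathbb{E}_{P_S}\big[D_{\mathrm{KL}}(P_{W|S} \,\|\, Q_W)\big]}{2n}}.
$$

In both displays the divergence term involves the posterior given the entire sample $S$; according to the abstract, the block-sample bounds replace it with divergence terms that depend only on blocks of the training data.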

Paper Structure (19 sections, 6 theorems, 63 equations, 1 figure, 1 table)

Key Result

Theorem 1

Let $m \in [n] := \{1, \dots, n\}$, assume that $n$ is an integer multiple of $m$, and let $S' = (Z'_1, \ldots, Z'_m)$, where $Z'_1, \ldots, Z'_m$ are drawn i.i.d. from $P_Z$. Assume that for $\lambda' \in (0,b)$ and some distribution $Q_W$ over $\mathcal{W}$, it holds that […] for some function $\Phi_m: (0,b) \rightarrow (0,\infty)$. Then for any $\lambda \in (0,bn/m)$, it holds that […], where $J := n/m$.

Figures (1)

  • Figure 1: Comparison of the true generalization error and theoretical bounds for the example in Section \ref{sec:gaussian-truncated-loss} with $\mu=1/2$. The solid blue curve shows the optimal bound \ref{eq:example-catoni-bound} with the optimal choice of $m$.

Theorems & Definitions (14)

  • Theorem 1: Block-sample MAC-Bayes bounds
  • Proof
  • Remark 1
  • Remark 2: Data dependent priors
  • Corollary 1
  • Remark 3
  • Corollary 2
  • Remark 4
  • Theorem 2
  • Remark 5
  • ...and 4 more