Tighter Information-Theoretic Generalization Bounds via a Novel Class of Change of Measure Inequalities

Yanxiao Liu; Yijun Fan; Deniz Gündüz

TL;DR

This work addresses the problem of providing tighter high-probability generalization bounds for stochastic learning algorithms by introducing a novel class of change-of-measure inequalities derived from the data processing inequality for $f$-divergences. By unifying a broad set of information measures—including $f$-divergences (KL, $\chi^2$), Rényi divergence, and Sibson $\alpha$-mutual information (maximal leakage as a special case)—the authors obtain flexible, tighter bounds that apply across PAC-Bayesian theory, conditional mutual information, and differential privacy contexts. The proposed DPI-based framework yields novel bounds and recovers several known results with simpler analyses, while often outperforming existing bounds in key regimes. This approach provides a versatile toolkit for provable generalization guarantees across privacy-preserving, stability-based, and Bayesian learning settings, with potential applicability to deep learning generalization analyses as well.

Abstract

In this paper, we propose a novel class of change of measure inequalities via a unified framework based on the data processing inequality for $f$-divergences, which is surprisingly elementary yet powerful enough to yield tighter inequalities. We provide change of measure inequalities in terms of a broad family of information measures, including $f$-divergences (with Kullback-Leibler divergence and $\chi^2$-divergence as special cases), Rényi divergence, and $\alpha$-mutual information (with maximal leakage as a special case). We then embed these inequalities into the analysis of generalization error for stochastic learning algorithms, yielding novel and tighter high-probability information-theoretic generalization bounds, while also recovering several best-known results via simplified analyses. A key advantage of our framework is its flexibility: it readily adapts to a range of settings, including the conditional mutual information framework, PAC-Bayesian theory, and differential privacy mechanisms, for which we derive new generalization bounds.
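As a rough illustration of the mechanism the abstract describes, the sketch below numerically checks the textbook consequence of the data processing inequality in the Kullback-Leibler case: mapping each outcome to the indicator of an event $E$ cannot increase the divergence, so $D(P\|Q) \ge d(P(E)\|Q(E))$, where $d$ is the binary KL divergence. This is only the standard KL special case on finite alphabets, not the paper's own inequalities; all function names (`kl`, `binary_kl`, `random_dist`, `check_dpi`) are hypothetical and chosen for illustration.

```python
import math
import random


def kl(p, q):
    """KL divergence D(P||Q) in nats for discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def binary_kl(a, b):
    """Binary KL divergence d(a||b) between Bernoulli(a) and Bernoulli(b)."""
    return kl([a, 1.0 - a], [b, 1.0 - b])


def random_dist(n, rng):
    """Random strictly positive distribution on n points (illustrative helper)."""
    weights = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def check_dpi(n=6, trials=1000, seed=0):
    """Return the smallest observed gap D(P||Q) - d(P(E)||Q(E)) over random trials.

    The data processing inequality says this gap is never negative
    (up to floating-point error).
    """
    rng = random.Random(seed)
    worst_gap = float("inf")
    for _ in range(trials):
        p = random_dist(n, rng)
        q = random_dist(n, rng)
        # Draw a random event E; skip the trivial cases E = empty set or E = everything.
        event = [rng.random() < 0.5 for _ in range(n)]
        if all(event) or not any(event):
            continue
        p_e = sum(pi for pi, in_e in zip(p, event) if in_e)
        q_e = sum(qi for qi, in_e in zip(q, event) if in_e)
        gap = kl(p, q) - binary_kl(p_e, q_e)
        worst_gap = min(worst_gap, gap)
    return worst_gap


if __name__ == "__main__":
    print("smallest observed gap:", check_dpi())
```

Reading the inequality in the other direction gives a change of measure statement: if $Q(E)$ is small and $D(P\|Q)$ is bounded, then $P(E)$ cannot be too large. Swapping `kl` for another $f$-divergence would probe the same data-processing step for that divergence, under the same finite-alphabet assumption.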

Paper Structure

This paper contains 49 sections, 20 theorems, 207 equations, 1 figure, and 1 table.

Introduction
Related Work
Information Measures
Change of Measure Inequalities
Generalization Error Bounds
Generalization Error Bounds via $f$-Divergence
Generalization Error Bounds via Maximal Leakage and $\alpha$-Mutual Information
PAC-Bayesian Bounds
Generalization Error Bounds via Conditional Mutual Information
Generalization and Approximate Differential Privacy
Future Directions
More on Related Work
Change of Measure Inequalities.
Generalization Error Bounds.
Information Measures.
...and 34 more sections

Key Result

Proposition 1

Fix probability measures $P, Q$ on $\mathcal{X}$ such that $P\ll Q$. For all measurable $E$,

Figures (1)

Figure 1: Comparison between Corollary \ref{cor::gen_bd_MI} and the bound of [chu2023unified].

Theorems & Definitions (35)

Definition 1
Definition 2
Definition 3
Proposition 1
proof
Theorem 2
Theorem 3
Theorem 4
Theorem 5
Corollary 6
...and 25 more
