Information-Theoretic Generalization Bounds for Transductive Learning and its Applications

Huayi Tang; Yong Liu

TL;DR

The paper develops a comprehensive information-theoretic and PAC-Bayesian framework for transductive learning, addressing both random splitting and random sampling, and introduces transductive supersamples to bridge inductive results to transduction. It derives mutual information and conditional mutual information bounds, along with transductive PAC-Bayes bounds, under relaxed loss and data-count assumptions, and extends to adaptive optimization scenarios. The results are applied to semi-supervised learning and transductive graph learning, with empirical validation showing non-vacuous bounds that track the generalization gap as labeled data increases. Together, these contributions provide principled, data- and algorithm-dependent generalization guarantees for transductive models, including GNNs, in practical setups. The work has implications for understanding generalization in complex, label-efficient learning settings and informs the design of transductive strategies in real-world systems.

Abstract

In this paper, we establish generalization bounds for transductive learning algorithms in the context of information theory and PAC-Bayes, covering both the random sampling and the random splitting settings. First, we show that the transductive generalization gap can be controlled by the mutual information between the training label selection and the hypothesis. Next, we propose the concept of a transductive supersample and use it to derive transductive information-theoretic bounds involving conditional mutual information and different information measures. We further establish transductive PAC-Bayesian bounds with weaker assumptions on the type of loss function and the number of training and test data points. Lastly, we use the theoretical results to derive upper bounds for adaptive optimization algorithms under the transductive learning setting. We also apply them to semi-supervised learning and transductive graph learning scenarios, and validate the derived bounds through experiments on synthetic and real-world datasets.
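To make the quantity being bounded concrete, the following is a minimal sketch of the transductive generalization gap under random splitting: the full sample of $n$ points is fixed, $m$ of them are selected uniformly at random to serve as labeled training points, and the gap is the error on the remaining $n - m$ points minus the training error. The Gaussian toy data, the threshold learner, and all function names are illustrative assumptions and do not come from the paper.

```python
import random

def random_split(n, m, rng):
    """Select m of the n indices uniformly without replacement as training indices."""
    idx = list(range(n))
    rng.shuffle(idx)
    return sorted(idx[:m]), sorted(idx[m:])

def fit_threshold(xs, ys, train_idx):
    """Toy learner (an assumption for illustration): midpoint of the two class means."""
    pos = [xs[i] for i in train_idx if ys[i] == 1]
    neg = [xs[i] for i in train_idx if ys[i] == 0]
    if not pos or not neg:
        return 0.0  # fall back to a fixed threshold if a class is missing
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def zero_one_error(xs, ys, idx, threshold):
    """Average zero-one loss (bounded in [0, 1]) over the given indices."""
    return sum((xs[i] > threshold) != (ys[i] == 1) for i in idx) / len(idx)

def transductive_gap(xs, ys, m, rng):
    """Test error on the n - m held-out points minus training error on the m selected points."""
    train_idx, test_idx = random_split(len(xs), m, rng)
    w = fit_threshold(xs, ys, train_idx)
    return zero_one_error(xs, ys, test_idx, w) - zero_one_error(xs, ys, train_idx, w)

rng = random.Random(0)
n, m = 200, 50
# The full sample is fixed once; only the train/test split is random afterwards.
ys = [i % 2 for i in range(n)]
xs = [rng.gauss(1.0 if y == 1 else -1.0, 1.0) for y in ys]

gaps = [transductive_gap(xs, ys, m, rng) for _ in range(500)]
mean_gap = sum(gaps) / len(gaps)
print(f"mean gap over 500 random splits: {mean_gap:.4f}")
```

Averaging over repeated splits estimates the expected gap with respect to the random selection of training points, which is the source of randomness that the bounds described above are stated over.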

Paper Structure (37 sections, 18 theorems, 188 equations, 3 figures, 1 table)

Introduction
Related Work
Information-theoretic Generalization Theory
PAC-Bayesian Generalization Theory
Generalization Theory of Transductive Learning
Preliminaries
Notations
Random Splitting Setting for Transductive Learning
Random Sampling Setting for Transductive Learning
Main Results
Mutual Information Bounds for Transductive Learning
Conditional Mutual Information Bounds for Transductive Learning
Transductive PAC-Bayesian Bounds under the Random Splitting Setting
Transductive PAC-Bayesian Bounds under the Random Sampling Setting
Upper Bounds for Adaptive Optimization Algorithms
...and 22 more sections

Key Result

Theorem 1

Suppose that $\ell(w,s) \in [0,B]$ holds for all $w \in \mathcal{W}, s \in \{s_i\}_{i=1}^n$, where $B > 0$ is a constant. Also, suppose that $P_{W,Z} \ll P_W P_Z$. Then we have

Figures (3)

Figure 1: Estimations of the transductive generalization gap and the derived bounds on MNIST and CIFAR-$10$ with different values of $m$ and $k$.
Figure 2: Estimations of the transductive generalization gap and the derived bounds on cSBMs with GAT and GPR-GNN. The first and third rows correspond to $\phi=-0.5$, and the second and fourth rows to $\phi=0.5$. The left, middle, and right panels in each row correspond to $k=1$, $k=2$, and $k=3$.
Figure 3: Estimations of the transductive generalization gap and the derived bounds on real-world datasets with GAT and GPR-GNN.

Theorems & Definitions (22)

Theorem 1
Theorem 2
Theorem 3: Yaniv2007, Theorem 1
Proposition 4
Definition 5: Transductive Supersample
Proposition 6
Theorem 7
Corollary 8
Definition 9: $k$-Transductive Supersample
Theorem 10
...and 12 more
