Transductive Generalization via Optimal Transport and Its Application to Graph Node Classification

MoonJeong Park; Seungbeom Lee; Kyungmin Kim; Jaeseung Heo; Seunghyuk Cho; Shouheng Li; Sangdon Park; Dongwoo Kim

TL;DR

This work establishes new representation-based generalization bounds in a distribution-free transductive setting, where learned representations are dependent and test features are accessible during training. It derives global and class-wise bounds via optimal transport, expressed as Wasserstein distances between encoded feature distributions.

Abstract

Many existing transductive bounds rely on classical complexity measures that are computationally intractable and often misaligned with empirical behavior. In this work, we establish new representation-based generalization bounds in a distribution-free transductive setting, where learned representations are dependent, and test features are accessible during training. We derive global and class-wise bounds via optimal transport, expressed in terms of Wasserstein distances between encoded feature distributions. We demonstrate that our bounds are efficiently computable and strongly correlate with empirical generalization in graph node classification, improving upon classical complexity measures. Additionally, our analysis reveals how the GNN aggregation process transforms the representation distributions, inducing a trade-off between intra-class concentration and inter-class separation. This yields depth-dependent characterizations that capture the non-monotonic relationship between depth and generalization error observed in practice. The code is available at https://github.com/ml-postech/Transductive-OT-Gen-Bound.
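To make the central quantity concrete, the following is a minimal sketch of a Wasserstein distance between train and test representations, computed globally and per class. It is not the authors' implementation (see the linked repository for that). It assumes scalar (1-D) representations so that the exact 1-Wasserstein distance reduces to the area between two empirical CDFs; encoded features are generally vectors, which would need a general optimal transport solver instead. The function names `wasserstein_1d` and `classwise_wasserstein` and the toy values are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def wasserstein_1d(train_values, test_values):
    """Exact 1-Wasserstein distance between two 1-D empirical distributions.

    Assumption: representations are scalars. In 1-D, W1 equals the integral
    of |F_train(x) - F_test(x)| over x, where F is the empirical CDF.
    """
    a = sorted(train_values)
    b = sorted(test_values)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    grid = sorted(set(a) | set(b))
    total = 0.0
    # Both CDFs are constant on each interval [left, right) of the merged grid.
    for left, right in zip(grid, grid[1:]):
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def classwise_wasserstein(train_values, train_labels, test_values, test_labels):
    """Per-class W1 between train and test representations (illustrative only).

    Classes missing from either split are skipped. This variant needs test
    labels; how the paper combines per-class terms is not reproduced here.
    """
    train_by_class = defaultdict(list)
    test_by_class = defaultdict(list)
    for value, label in zip(train_values, train_labels):
        train_by_class[label].append(value)
    for value, label in zip(test_values, test_labels):
        test_by_class[label].append(value)
    shared = sorted(set(train_by_class) & set(test_by_class))
    return {y: wasserstein_1d(train_by_class[y], test_by_class[y]) for y in shared}


if __name__ == "__main__":
    # Hypothetical scalar representations of train and test nodes.
    train_repr, train_y = [0.1, 0.2, 0.9, 1.1], [0, 0, 1, 1]
    test_repr, test_y = [0.0, 0.3, 1.0], [0, 0, 1]
    print("global:", wasserstein_1d(train_repr, test_repr))
    print("class-wise:", classwise_wasserstein(train_repr, train_y, test_repr, test_y))
```

The train and test samples may have different sizes, as is typical in a transductive split, because each empirical distribution is normalized by its own sample count.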

Paper Structure (39 sections, 9 theorems, 81 equations, 4 figures, 3 tables)

Introduction
Related work
Representation-based generalization bounds
Transductive generalization bounds
GNN-specific transductive analyses
Preliminaries
Generalization bound in transductive learning
Graph neural networks
Wasserstein distance
Wasserstein bounds in transductive learning
Setup
Theoretical analysis
Experiments
Datasets and experimental setup
Datasets and models
...and 24 more sections

Key Result

Theorem 4.1

Let $\gamma>0$. For any random split $\pi$ and all $f\circ\phi\in F\circ\Phi$, where $i \in \mathcal{I}_{\mathrm{train}}^{(\pi)}$, $j \in \mathcal{I}_{\mathrm{test}}^{(\pi)}$, and $y \in \mathcal{Y}$.

Figures (4)

Figure 1: Rank scatter plots of the empirical generalization error against (a) the PAC bound and (b) our proposed bound for SGC on the Squirrel dataset. The PAC bound shows weak rank correlation with the empirical generalization error, whereas our bound exhibits a stronger positive rank correlation.
Figure 2: Rank correlation between generalization bounds and empirical error gap across nine datasets and four GNN architectures. Global reports our bound from Theorem 4.1. Class-wise and Class-wise approx correspond to Theorem 4.2 with and without test labels, respectively. Darker blue indicates a stronger positive correlation. Our bounds consistently achieve high correlations, while PAC and RC bounds show weak or negative correlations in most cases. N/A indicates the bound cannot be computed.
Figure 3: Depth analysis of SGC (top) and GCN (bottom) on the Cora dataset.
Figure 4: Rank correlation between generalization bounds and empirical error gap across nine datasets for GraphSAGE. Darker blue indicates a stronger positive correlation. N/A indicates the bound cannot be computed.
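The figures above compare bounds to the empirical error gap through rank correlation. As an illustration of how such a score can be computed, here is a minimal sketch using Spearman's rank correlation with average ranks for ties. The choice of Spearman's coefficient is an assumption (the captions only say "rank correlation"), and the function names and toy numbers are hypothetical rather than taken from the paper.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman(bound_values, error_gaps):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(bound_values) != len(error_gaps) or len(bound_values) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(bound_values), average_ranks(error_gaps)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical bound values and error gaps for five trained models.
    bounds = [0.8, 1.2, 0.5, 2.0, 1.5]
    gaps = [0.10, 0.14, 0.07, 0.30, 0.18]
    print("Spearman rho:", spearman(bounds, gaps))
```

A value near 1 means the bound orders models the same way the measured error gap does, which is the property the scatter plots and heatmaps are meant to show; the absolute scale of the bound does not affect the score.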

Theorems & Definitions (15)

Theorem 4.1: Global bound in the transductive setting
Theorem 4.2: Class-wise bound in the transductive setting
Remark (cf. Lemma 10 in chuang2021measuring; Proposition 5.2 in li2025towards)
Proposition 6.1
Proposition 6.2
Theorem 1.1: Global bound in the transductive setting
proof
Theorem 1.1: Class-wise bound in the transductive setting
proof
Definition 1.1: $(m,u)$-permutation symmetry el2009transductive
...and 5 more
