Posterior Label Smoothing for Node Classification

Jaeseung Heo; Moonjeong Park; Dongwoo Kim

TL;DR

PosteL tackles node classification on graphs ranging from homophilic to heterophilic by deriving soft labels from a posterior over each node's label conditioned on its neighbors' labels. The posterior combines a likelihood, factorized as a product of neighborhood label conditionals, with a prior given by global label frequencies; iterative pseudo-labeling refines these statistics across training rounds. Across 10 datasets and 8 backbone models, PosteL yields consistent improvements, with strong gains on heterophilic graphs, including a 14.43% boost on Cornell with GCN, while remaining competitive on homophilic graphs. The approach offers a practical, scalable soft-label regularization that mitigates overfitting and improves generalization, and the code is available for replication.
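The posterior summarized above can be written out with Bayes' rule. The following is a sketch under the factorization assumption stated in the summary, not a reproduction of the paper's numbered equations; $K$ (the number of classes) and $y_j$ (the observed label of neighbor $j$) are notation introduced here for illustration, and the conditionals follow the form used in the Figure 3 caption.

$$
P\big(Y_i = k \mid \{Y_j = y_j\}_{j \in \mathcal{N}(i)}\big)
= \frac{P(Y_i = k)\,\prod_{j \in \mathcal{N}(i)} P\big(Y_j = y_j \mid Y_i = k,\ j \in \mathcal{N}(i)\big)}
{\sum_{k'=1}^{K} P(Y_i = k')\,\prod_{j \in \mathcal{N}(i)} P\big(Y_j = y_j \mid Y_i = k',\ j \in \mathcal{N}(i)\big)}
$$

Here the prior $P(Y_i = k)$ is the global frequency of class $k$, and each conditional is a graph-wide statistic rather than a per-node quantity, which is what lets the same formula apply to both homophilic and heterophilic graphs.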

Abstract

Label smoothing is a widely studied regularization technique in machine learning. However, its potential for node classification in graph-structured data, spanning homophilic to heterophilic graphs, remains largely unexplored. We introduce posterior label smoothing, a novel method for transductive node classification that derives soft labels from a posterior distribution conditioned on neighborhood labels. The likelihood and prior distributions are estimated from the global statistics of the graph structure, allowing our approach to adapt naturally to various graph properties. We evaluate our method on 10 benchmark datasets using eight baseline models, demonstrating consistent improvements in classification accuracy. The following analysis demonstrates that soft labels mitigate overfitting during training, leading to better generalization performance, and that pseudo-labeling effectively refines the global label statistics of the graph. Our code is available at https://github.com/ml-postech/PosteL.
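A minimal sketch of how such posterior soft labels could be computed is given below. It is not the authors' implementation (see the linked repository for that). The function name `posterior_soft_labels`, its parameters, the edge-count estimator for the conditionals, and the `eps` smoothing constant are all assumptions made for illustration. How the resulting posterior is combined with the original one-hot label during training is not covered here.

```python
import numpy as np

def posterior_soft_labels(edges, labels, train_mask, num_classes, eps=1e-12):
    """Hypothetical helper: posterior over each node's label given the
    labels of its labeled neighbors. Returns an (n_nodes, num_classes) array."""
    labels = np.asarray(labels)
    train_mask = np.asarray(train_mask, dtype=bool)
    n = labels.shape[0]

    # Prior P(Y = k): global label frequencies among labeled nodes.
    prior = np.bincount(labels[train_mask], minlength=num_classes).astype(float)
    prior /= prior.sum()

    # Conditional cond[k, m] ~ P(Y_j = m | Y_i = k, j in N(i)), estimated
    # (assumption) by counting edges whose two endpoints are both labeled.
    cond = np.zeros((num_classes, num_classes))
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
        if train_mask[u] and train_mask[v]:
            cond[labels[u], labels[v]] += 1
            cond[labels[v], labels[u]] += 1
    cond = (cond + eps) / (cond + eps).sum(axis=1, keepdims=True)

    soft = np.empty((n, num_classes))
    for i in range(n):
        # Work in log space: log prior plus the sum of log conditionals,
        # i.e. the product-of-conditionals likelihood.
        log_post = np.log(prior + eps)
        for j in neighbors[i]:
            if train_mask[j]:
                log_post += np.log(cond[:, labels[j]])
        log_post -= log_post.max()  # numerical stability before exponentiating
        post = np.exp(log_post)
        soft[i] = post / post.sum()
    return soft

# Usage on a toy heterophilic path graph 0-1-2-3 with alternating labels.
edges = [(0, 1), (1, 2), (2, 3)]
labels = [0, 1, 0, 1]
soft = posterior_soft_labels(edges, labels, [True] * 4, num_classes=2)
```

In this toy graph every edge connects different classes, so each node's posterior concentrates on the class opposite to its neighbors' labels. One plausible way to mimic the pseudo-labeling step would be to call the function again with model predictions filled in for unlabeled nodes and a mask that includes them; that loop is an assumption and is not shown.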

Paper Structure (41 sections, 8 theorems, 36 equations, 11 figures, 10 tables, 2 algorithms)

Introduction
Related Work
Node Classification
Classification with Soft Labels
Method
Posterior Label Smoothing
Iterative Pseudo-labeling
Theoretical Analysis of PosteL
Experiments
Node Classification
Datasets
Experimental Setup and Baselines
Results
Empirical Analysis
Empirical Validation of the Conditional Independence in \ref{eqn:factorization}
...and 26 more sections

Key Result

Lemma 1

Suppose that the classes are balanced, i.e., $P(\hat{Y} = 0) = P(\hat{Y} = 1)$ and the graph is homophilic, i.e., $c_k > 1 - c_{1-k}$. Then, for any node $i$ with neighbors $\mathcal{N}(i)$, the posterior probability satisfies, if and only if

Figures (11)

Figure 1: Overall illustration of posterior label smoothing. To relabel a node, we compute the posterior distribution of its label given the labels of its neighbors. The likelihood and prior distributions are estimated from global statistics. These statistics are updated with pseudo-labels after training, resulting in an iterative algorithm.
Figure 2: Toy example illustrating the difference between PosteL and SALS (Wang et al., 2021). The leftmost column shows three examples of a target node (represented as T) with different local neighborhood structures. The second and third columns show how SALS and PosteL create soft labels with homophilic and heterophilic graphs, respectively.
Figure 3: Estimated likelihood via product of marginals $P(Y_j|Y_i=0,j\in\mathcal{N}(i))\times P(Y_k|Y_i=0,k\in\mathcal{N}(i))$ and empirical joint distribution $P(Y_j, Y_k|Y_i=0,j,k\in\mathcal{N}(i))$.
Figure 4: Loss curve comparisons: (a) using ground-truth (GT) labels versus PosteL labels on the Squirrel dataset; (b) across iterations of iterative pseudo-labeling on the Cornell dataset.
Figure 5: Estimated conditional distributions obtained from (a) training labels only, (b) training labels combined with pseudo-labels, and (c) all ground-truth labels.
...and 6 more figures

Theorems & Definitions (11)

Lemma 1: Homophilic graph
Lemma 2: Heterophilic graph
Lemma 3: Same degree
Lemma 4: Different degree
Lemma 5: Homophilic graph
Lemma 6: Heterophilic graph
proof
Lemma 7: Same degree
proof
Lemma 8: Different degree
...and 1 more
