Understanding Oversmoothing in Diffusion-Based GNNs From the Perspective of Operator Semigroup Theory
Weichen Zhao, Chenguang Wang, Xinyan Wang, Congying Han, Tiande Guo, Tianshu Yu
TL;DR
This work provides a unified, mathematically grounded framework for analyzing oversmoothing in diffusion-based GNNs using operator semigroup theory. It shows that oversmoothing arises from the ergodicity of the diffusion generator, that the limiting (fixed-point) behavior is determined by an invariant measure, and that the convergence rate is tied to the spectral gap. By introducing an ergodicity-breaking term, the approach generalizes existing remedies for diffusion-based GNNs and offers a principled design guideline for mitigating oversmoothing, complemented by a probabilistic interpretation via Markov processes and a killing process. Empirical results on node classification across diverse datasets show reduced oversmoothing (higher Dirichlet energy) and improved performance, supporting the theoretical claims.
Abstract
This paper presents an analytical study of the oversmoothing issue in diffusion-based Graph Neural Networks (GNNs). Generalizing beyond existing approaches grounded in random walk analysis or particle systems, we approach this problem through operator semigroup theory. This theoretical framework allows us to rigorously prove that oversmoothing is intrinsically linked to the ergodicity of the diffusion operator. Relying on the semigroup method, we quantitatively analyze the dynamics of graph diffusion and give a specific mathematical form of the smoothed feature in terms of the ergodicity and invariant measure of the operator, which improves on previous works that only show the existence of oversmoothing. This finding further yields a general and mild ergodicity-breaking condition that encompasses the various specific solutions previously offered, thereby providing a more universal and theoretically grounded approach to relieving oversmoothing in diffusion-based GNNs. Additionally, we offer a probabilistic interpretation of our theory, forging a link with prior works and broadening the theoretical horizon. Our experimental results show that this ergodicity-breaking term effectively mitigates oversmoothing, as measured by Dirichlet energy, and simultaneously improves performance in node classification tasks.
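The link between ergodicity, the invariant measure, and the spectral gap can be illustrated numerically. The following is a minimal NumPy sketch and not the paper's implementation. It assumes the simplest linear case, where the generator is the random-walk operator L = D^{-1}A - I on a connected undirected graph, so that the semigroup is P_t = exp(tL) and the invariant measure is pi = d / sum(d). The toy graph, the random features, and the function names (diffusion_semigroup, dirichlet_energy, spectral_gap) are illustrative assumptions, and the Dirichlet energy used here is the unnormalized form, which may differ from the normalization used in the paper.

```python
import numpy as np

def diffusion_semigroup(A, t):
    """P_t = exp(t * (D^{-1} A - I)) for a connected undirected graph (assumed setup)."""
    d = A.sum(axis=1)
    # S = D^{-1/2} A D^{-1/2} - I is symmetric and similar to the generator.
    S = A / np.sqrt(np.outer(d, d)) - np.eye(len(d))
    lam, U = np.linalg.eigh(S)
    exp_tS = (U * np.exp(t * lam)) @ U.T
    # Undo the similarity transform: P_t = D^{-1/2} exp(tS) D^{1/2}.
    return exp_tS * np.sqrt(d)[None, :] / np.sqrt(d)[:, None]

def dirichlet_energy(A, X):
    """Unnormalized Dirichlet energy: 0.5 * sum_ij A_ij * ||x_i - x_j||^2."""
    diff = X[:, None, :] - X[None, :, :]
    return 0.5 * float((A * (diff ** 2).sum(axis=-1)).sum())

def spectral_gap(A):
    """Smallest nonzero eigenvalue of -S, i.e. of I - D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)
    S = A / np.sqrt(np.outer(d, d)) - np.eye(len(d))
    return float(np.sort(-np.linalg.eigvalsh(S))[1])

# Hypothetical toy graph with 5 nodes and random 3-dimensional features.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2)]:
    A[i, j] = A[j, i] = 1.0
X = np.random.default_rng(0).normal(size=(5, 3))

# Invariant measure of the random walk; the ergodic limit assigns every node
# the same pi-weighted average of the initial features.
pi = A.sum(axis=1) / A.sum()
limit = np.ones((5, 1)) * (pi @ X)

for t in [0.0, 1.0, 5.0, 20.0]:
    Xt = diffusion_semigroup(A, t) @ X
    print(f"t={t:5.1f}  energy={dirichlet_energy(A, Xt):.3e}  "
          f"max|X_t - limit|={np.abs(Xt - limit).max():.3e}")
```

In this linear setting, the printed Dirichlet energy is bounded by exp(-2 * gap * t) times its initial value, where gap is the value returned by spectral_gap, and the features of all nodes approach the same pi-weighted average. This is the oversmoothing behavior described above, in its simplest form. A term that breaks ergodicity would prevent this collapse to a single common vector; the sketch does not attempt to model that term.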
