Implicit Graph Neural Diffusion Networks: Convergence, Generalization, and Over-Smoothing
Guoji Fu, Mohammed Haroon Dupty, Yanfei Dong, Lee Wee Sun
TL;DR
The paper tackles over-smoothing and unreliable convergence in implicit GNNs by proposing a geometric framework that learns vertex and edge metrics through a parameterized graph Laplacian $\Delta_\Phi$. It casts graph diffusion as the fixed-point solution of a Dirichlet-energy minimization problem with feature-constrained nodes and introduces DIGNN, a model that avoids over-smoothing during training (OST) and inference (OSI) and converges to a unique equilibrium when the hyperparameter $\mu$ satisfies $\mu > \lambda_{\max}(\Delta_\Phi)$. The authors derive transductive generalization bounds that depend on the ratio $\lambda_\Phi/\mu$, and validate the theory with experiments showing state-of-the-art results on heterophilic node classification and strong performance on graph classification. Overall, the work pairs theoretical guarantees with practical improvements by learning graph metrics within implicit diffusion layers.
Abstract
Implicit Graph Neural Networks (GNNs) have recently achieved significant success in addressing graph learning problems. However, poorly designed implicit GNN layers may have limited adaptability to learn graph metrics, experience over-smoothing issues, or exhibit suboptimal convergence and generalization properties, potentially hindering their practical performance. To tackle these issues, we introduce a geometric framework for designing implicit graph diffusion layers based on a parameterized graph Laplacian operator. Our framework allows learning the metrics of the vertex and edge spaces, as well as the graph diffusion strength, from data. We show how implicit GNN layers can be viewed as the fixed-point equation of a Dirichlet energy minimization problem and give conditions under which they may suffer from over-smoothing during training (OST) and inference (OSI). We further propose a new implicit GNN model, DIGNN, to avoid OST and OSI. We establish that, with an appropriately chosen hyperparameter greater than the largest eigenvalue of the parameterized graph Laplacian, DIGNN guarantees a unique equilibrium, quick convergence, and strong generalization bounds. Our models demonstrate better performance than most implicit and explicit GNN baselines on benchmark datasets for both node and graph classification tasks.
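To make the convergence condition concrete, the following is a minimal sketch of a linear implicit diffusion layer, not the authors' implementation. It assumes a fixed, unnormalized Laplacian $L = D - A$ in place of the learned $\Delta_\Phi$, and it assumes the energy $\tfrac{1}{2}\operatorname{tr}(Z^\top L Z) + \tfrac{\mu}{2}\lVert Z - X\rVert_F^2$, whose stationarity condition is the fixed-point equation $Z = X - \tfrac{1}{\mu} L Z$. The function names `graph_laplacian` and `implicit_diffusion` are hypothetical. Under these assumptions the iteration contracts at rate $\lambda_{\max}(L)/\mu$, so choosing $\mu > \lambda_{\max}(L)$ yields a unique equilibrium.

```python
import numpy as np


def graph_laplacian(adj):
    """Unnormalized Laplacian L = D - A (a stand-in for the learned operator)."""
    deg = adj.sum(axis=1)
    return np.diag(deg) - adj


def implicit_diffusion(lap, x, mu, tol=1e-10, max_iter=1000):
    """Iterate Z <- X - (1/mu) L Z until successive iterates differ by < tol.

    Returns the final iterate and the number of iterations used.
    The map is a contraction when mu > lambda_max(L).
    """
    z = x.copy()
    for it in range(1, max_iter + 1):
        z_next = x - lap @ z / mu
        if np.linalg.norm(z_next - z) < tol:
            return z_next, it
        z = z_next
    return z, max_iter


# Toy example: a path graph on 4 nodes with 2-dimensional node features.
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
lap = graph_laplacian(adj)
lam_max = np.linalg.eigvalsh(lap)[-1]  # eigenvalues are returned in ascending order
mu = 1.5 * lam_max                     # assumed choice satisfying mu > lambda_max

x = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [0.5, 0.5]])
z_star, n_iter = implicit_diffusion(lap, x, mu)

# Closed-form equilibrium of the same linear system: (I + L/mu) Z = X.
z_closed = np.linalg.solve(np.eye(len(adj)) + lap / mu, x)
```

In this simplified setting the iterate `z_star` agrees with the closed-form solve `z_closed`. Moving `mu` closer to `lam_max` slows the contraction, and for `mu` below `lam_max` the iteration is no longer guaranteed to converge.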
