Graph Neural Networks Need Cluster-Normalize-Activate Modules
Arseny Skryagin, Felix Divo, Mohammad Amin Ali, Devendra Singh Dhami, Kristian Kersting
TL;DR
Graph Neural Networks struggle with oversmoothing as depth increases, causing node representations to collapse. The authors introduce Cluster-Normalize-Activate (CNA), a plug-and-play module that clusters node features per layer, normalizes within clusters, and applies cluster-specific learnable activations to preserve diversity. They provide theoretical arguments showing CNA thwarts standard oversmoothing proofs and demonstrate strong empirical gains across node classification, node property prediction, and graph-level tasks, with fewer parameters than competing models. The results suggest CNA enables deeper, more expressive GNNs with practical efficiency gains for real-world graph tasks.
Abstract
Graph Neural Networks (GNNs) are non-Euclidean deep learning models for graph-structured data. Despite their successful and diverse applications, oversmoothing prevents deep architectures because node features converge to a single fixed point. This severely limits their potential to solve complex tasks. To counteract this tendency, we propose a plug-and-play module consisting of three steps: Cluster-Normalize-Activate (CNA). By applying CNA modules, GNNs search for and form super nodes in each layer, which are normalized and activated individually. We demonstrate in node classification and property prediction tasks that CNA significantly improves accuracy over the state of the art. In particular, CNA reaches 94.18% and 95.75% accuracy on Cora and CiteSeer, respectively. It also benefits GNNs in regression tasks, reducing the mean squared error compared to all baselines. At the same time, GNNs with CNA require substantially fewer learnable parameters than competing architectures.
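The three steps named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say which clustering algorithm or activation family is used, so the sketch assumes plain k-means for the Cluster step and a leaky ReLU with one slope per cluster as a stand-in for the cluster-specific learnable activation. The names `kmeans`, `cna`, and `slopes` are hypothetical, and in a trained model the slopes would be learned rather than fixed.

```python
import numpy as np


def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns one label per row."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct rows; fancy indexing returns a copy.
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every node to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[c] = members.mean(axis=0)
    return labels


def cna(x, k, slopes, eps=1e-5):
    """Cluster-Normalize-Activate on node features x of shape (nodes, features)."""
    labels = kmeans(x, k)                      # Cluster
    out = np.empty_like(x)
    for c in range(k):
        mask = labels == c
        if not mask.any():
            continue
        h = x[mask]
        # Normalize: zero mean, unit variance per feature within the cluster.
        h = (h - h.mean(axis=0)) / np.sqrt(h.var(axis=0) + eps)
        # Activate: leaky ReLU with a cluster-specific slope (assumed stand-in).
        out[mask] = np.where(h > 0, h, slopes[c] * h)
    return out, labels


# Toy usage: 12 nodes with 4 features each, grouped into 3 clusters.
rng = np.random.default_rng(0)
x = rng.normal(size=(12, 4))
slopes = np.array([0.1, 0.2, 0.3])  # would be learnable parameters in practice
out, labels = cna(x, k=3, slopes=slopes)
print(out.shape, labels)
```

In a GNN, a module of this kind would take the place of the usual activation after each message-passing layer, so that nodes in different clusters are transformed differently instead of all passing through the same function.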
