Differentiable Cluster Graph Neural Network
Yanfei Dong, Mohammed Haroon Dupty, Lambert Deng, Zhuanghua Liu, Yong Liang Goh, Wee Sun Lee
TL;DR
DC-GNN tackles the dual challenges of long-range information propagation and heterophily by injecting a differentiable clustering inductive bias into GNN message passing. It augments the original graph with cluster-nodes (global and local), which form a bipartite graph with the original nodes, and optimizes an optimal transport (OT) based clustering objective \mathcal{L}_{\rm cluster} through a differentiable block-coordinate descent (DC-MsgPassing) that alternates between Sinkhorn-based assignment updates and closed-form embedding updates. The approach yields two key benefits: local cluster-nodes preserve graph structure and improve aggregation in heterophilous neighborhoods, while global cluster-nodes enable information transfer between distant nodes. Together these reduce oversquashing and improve performance on both heterophilous and homophilous datasets. Empirical results on 14 datasets demonstrate state-of-the-art accuracy on heterophilous graphs and strong results on homophilous graphs, with ablations showing the value of the clustering terms and regularizers for robust learning.
Abstract
Graph Neural Networks often struggle with long-range information propagation and with heterophilous neighborhoods. We address both challenges with a unified framework that incorporates a clustering inductive bias into the message passing mechanism, using additional cluster-nodes. Central to our approach is the formulation of an optimal-transport-based implicit clustering objective function. However, the algorithm for solving the implicit objective function needs to be differentiable to enable end-to-end learning of the GNN. To facilitate this, we adopt an entropy-regularized objective function and propose an iterative optimization process that alternates between solving for the cluster assignments and updating the node/cluster-node embeddings. Notably, our derived closed-form optimization steps are themselves simple yet elegant message passing steps operating seamlessly on a bipartite graph of nodes and cluster-nodes. Our clustering-based approach can effectively capture both local and global information, as demonstrated by extensive experiments on both heterophilous and homophilous datasets.
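
To make the alternating scheme concrete, the following is a minimal NumPy sketch of generic entropy-regularized OT clustering: a Sinkhorn step solves for soft assignments between points and cluster centers, and a closed-form step updates the centers as assignment-weighted means. It is not the paper's DC-MsgPassing; the function names (`sinkhorn`, `alternate_cluster`), the squared Euclidean cost, the uniform marginals, the cost normalization, and the value of `eps` are all assumptions chosen for illustration, and node embeddings are held fixed here whereas the abstract describes updating them as well.

```python
import numpy as np

def sinkhorn(cost, row_marg, col_marg, eps=0.5, n_iters=200):
    """Entropy-regularized OT plan between rows (points) and columns (clusters)."""
    K = np.exp(-cost / eps)              # Gibbs kernel
    u = np.ones_like(row_marg)
    v = np.ones_like(col_marg)
    for _ in range(n_iters):
        u = row_marg / (K @ v)           # rescale to match row marginals
        v = col_marg / (K.T @ u)         # rescale to match column marginals
    return u[:, None] * K * v[None, :]   # transport plan = soft assignment matrix

def alternate_cluster(X, n_clusters, n_rounds=10, eps=0.5, seed=0):
    """Alternate Sinkhorn assignment updates with closed-form center updates."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Hypothetical initialization: pick random points as initial centers.
    C = X[rng.choice(n, size=n_clusters, replace=False)].copy()
    a = np.full(n, 1.0 / n)                     # uniform mass on points (assumption)
    b = np.full(n_clusters, 1.0 / n_clusters)   # uniform mass on clusters (assumption)
    for _ in range(n_rounds):
        # Squared Euclidean cost, normalized to [0, 1] to keep exp() well-behaved.
        cost = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        cost = cost / cost.max()
        P = sinkhorn(cost, a, b, eps)           # assignment update
        C = (P.T @ X) / P.sum(axis=0)[:, None]  # center update: weighted mean
    return P, C

# Usage on two synthetic blobs (illustrative data only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(3, 0.1, (20, 2))])
P, C = alternate_cluster(X, n_clusters=2)
print(P.shape, C.shape)
```

Every operation in the loop is differentiable, which is what allows an alternation of this kind to sit inside an end-to-end trained network. A smaller `eps` gives sharper assignments but can underflow in this plain-domain form; a log-domain Sinkhorn is the usual remedy.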
