Not All Neighbors Are Worth Attending to: Graph Selective Attention Networks for Semi-supervised Learning
Tiantian He, Haicang Zhou, Yew-Soon Ong, Gao Cong
TL;DR
This work challenges the assumption, common in graph attention networks, that every neighbor of a node should be attended to when computing its representation. It introduces Selective Attention (SA), which uses learnable node-node dissimilarity to set a per-node scope of attention, and builds Graph Selective Attention Networks (SATs) that aggregate information preferentially from highly relevant neighbors. The authors provide theoretical analysis showing that SA-based layers can reach the 1-WL upper bound on expressive power when combined with an enhanced attention aggregation, and they report strong empirical gains on multiple real-world datasets for semi-supervised classification and clustering. The results suggest that excluding irrelevant neighbors yields richer representations, with SATs improving performance at the cost of modestly higher parameter and memory requirements. The approach is relevant to accurate graph learning in domains with complex, heterogeneous graph structures.
Abstract
Graph attention networks (GATs) are powerful tools for analyzing graph data from various real-world scenarios. To learn representations for downstream tasks, GATs generally attend to all neighbors of the central node when aggregating features. In this paper, we show that in many real-world graphs a large portion of the neighbors are irrelevant to the central node and can be excluded from neighbor aggregation. Taking this cue, we present Selective Attention (SA) and a series of novel attention mechanisms for graph neural networks (GNNs). SA leverages diverse forms of learnable node-node dissimilarity to acquire the scope of attention for each node, from which irrelevant neighbors are excluded. We further propose Graph Selective Attention Networks (SATs) to learn representations from the highly correlated node features identified and investigated by different SA mechanisms. Lastly, we present a theoretical analysis of the expressive power of the proposed SATs and a comprehensive empirical study of SATs against state-of-the-art GNNs on challenging real-world datasets to demonstrate their effectiveness.
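The idea of a per-node scope of attention can be illustrated with a minimal sketch. It is not the SA formulation from the paper. It assumes a fixed Euclidean distance as the dissimilarity (the paper's dissimilarity is learnable) and a hard cutoff, and the names `selective_attention` and `threshold` are hypothetical. Neighbors whose dissimilarity to the central node exceeds the cutoff are dropped before the softmax, so they receive no attention weight at all.

```python
import math

def selective_attention(center, neighbors, scores, threshold):
    """Aggregate neighbor features, skipping neighbors that are too dissimilar.

    center:    feature vector of the central node (list of floats)
    neighbors: list of neighbor feature vectors
    scores:    raw (unnormalized) attention scores, one per neighbor
    threshold: hypothetical dissimilarity cutoff (assumption of this sketch)
    """
    # Assumption: dissimilarity is plain Euclidean distance, not a learned quantity.
    dissim = [math.dist(center, h) for h in neighbors]

    # Scope of attention: indices of neighbors considered relevant.
    kept = [j for j, d in enumerate(dissim) if d <= threshold]
    if not kept:
        # No relevant neighbor: fall back to the node's own features.
        return list(center), kept

    # Numerically stable softmax restricted to the kept neighbors.
    m = max(scores[j] for j in kept)
    exp = {j: math.exp(scores[j] - m) for j in kept}
    z = sum(exp.values())

    # Attention-weighted sum of the kept neighbors' features.
    out = [sum(exp[j] / z * neighbors[j][k] for j in kept)
           for k in range(len(center))]
    return out, kept

# The third neighbor has the largest raw score but is far from the center,
# so it is excluded and the two nearby neighbors share the attention.
out, kept = selective_attention(
    center=[0.0, 0.0],
    neighbors=[[1.0, 0.0], [0.0, 1.0], [10.0, 10.0]],
    scores=[0.0, 0.0, 5.0],
    threshold=2.0,
)
```

In a standard attention layer the distant neighbor would dominate the aggregation because of its high raw score; restricting the softmax to the scope of attention removes its influence entirely.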
