Channel-Attentive Graph Neural Networks
Tuğrul Hasan Karabulut, İnci M. Baytaş
TL;DR
This paper tackles the over-smoothing challenge in deep graph neural networks by introducing CHAT-GNN, a channel-wise attention mechanism that enables adaptive message-passing across feature channels. It defines a channel-attentive message function $\text{MSG}(\mathbf{h}_v, \mathbf{h}_w) = \beta(\mathbf{h}_v, \mathbf{h}_w) \odot \mathbf{h}_w$ with $\beta(\mathbf{h}_v, \mathbf{h}_w) = \tanh(\mathbf{W}_1 \mathbf{h}_v + \mathbf{W}_2 \mathbf{h}_w)$ and a combine phase that uses separate linear projections, yielding a full CHAT-GNN architecture trained for node classification. The authors provide theoretical bounds relating local variation and message differences, and demonstrate via extensive experiments that CHAT-GNN reduces Dirichlet energy decay and delivers state-of-the-art performance on heterophilous graphs while remaining competitive on homophilous ones. Visual analyses show learned channel weights adapt to neighbors and hops, supporting the claim of flexible, edge- and hop-aware information flow. Overall, CHAT-GNN offers a scalable, principled approach to mitigating over-smoothing and improving generalization in diverse graph domains.
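The message function above can be illustrated with a minimal sketch. It is not the authors' implementation (that is in their repository). It assumes square weight matrices so that $\beta$ has one coefficient per feature channel, uses plain Python lists in place of tensors, and aggregates messages by summation, which is an assumption made here for illustration. The combine phase is omitted. All function names are hypothetical.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def channel_attention(W1, W2, h_v, h_w):
    """beta(h_v, h_w) = tanh(W1 h_v + W2 h_w): one coefficient per channel."""
    a = matvec(W1, h_v)
    b = matvec(W2, h_w)
    return [math.tanh(a_i + b_i) for a_i, b_i in zip(a, b)]


def message(W1, W2, h_v, h_w):
    """MSG(h_v, h_w) = beta(h_v, h_w) * h_w, taken element-wise."""
    beta = channel_attention(W1, W2, h_v, h_w)
    return [b_i * x_i for b_i, x_i in zip(beta, h_w)]


def aggregate(W1, W2, h_v, neighbor_feats):
    """Sum the messages from all neighbors (sum aggregation is an assumption)."""
    total = [0.0] * len(h_v)
    for h_w in neighbor_feats:
        m = message(W1, W2, h_v, h_w)
        total = [t + m_i for t, m_i in zip(total, m)]
    return total


# Toy example with two feature channels.
W1 = [[1.0, 0.0], [0.0, 1.0]]  # identity: beta depends only on h_v here
W2 = [[0.0, 0.0], [0.0, 0.0]]  # zero: the neighbor does not affect beta
h_v = [0.0, 1.0]
h_w = [2.0, 3.0]

beta = channel_attention(W1, W2, h_v, h_w)
msg = message(W1, W2, h_v, h_w)
agg = aggregate(W1, W2, h_v, [h_w, h_w])
```

Because $\tanh$ maps into $(-1, 1)$, each channel of a neighbor's features can be passed, suppressed, or sign-flipped independently, which is what distinguishes this from a single scalar attention weight per edge.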
Abstract
Graph Neural Networks (GNNs) set the state of the art in representation learning for graph-structured data. They are used in many domains, from online social networks to complex molecules. Most GNNs leverage the message-passing paradigm and achieve strong performance on various tasks. However, the message-passing mechanism used in most models suffers from over-smoothing as a GNN's depth increases. Over-smoothing degrades a GNN's performance because the representations of unrelated nodes become increasingly similar. This study proposes an adaptive channel-wise message-passing approach to alleviate over-smoothing. The proposed model, Channel-Attentive GNN, learns how to attend to neighboring nodes and their feature channels. Thus, more diverse information can be transferred between nodes during message-passing. Experiments with widely used benchmark datasets show that the proposed model is more resistant to over-smoothing than baselines and achieves state-of-the-art performance on various graphs with strong heterophily. Our code is at https://github.com/ALLab-Boun/CHAT-GNN.
