BuffGraph: Enhancing Class-Imbalanced Node Classification via Buffer Nodes
Qian Wang, Zemin Liu, Zhen Zhang, Bingsheng He
TL;DR
This work addresses class-imbalanced node classification on graphs by targeting heterophily: buffer nodes are interposed along every edge to slow and modulate information flow, balancing minority and majority signals. BuffGraph combines three components: buffer-node generation via mixup; a dynamic message-passing framework guided by edge heterophily and trained with the joint loss L_{total} = L_{pred} + λ L_{hetero}; and a spectral analysis showing how buffering reshapes diffusion through the graph Laplacian. Empirically, BuffGraph consistently outperforms strong baselines on five real-world datasets in both naturally and artificially imbalanced settings, with notable gains in accuracy and macro F1 for minority classes, and it scales linearly with graph size in scalability tests. Theoretical and ablation analyses support these contributions and point to future work on heterophily-aware graph augmentation.
Abstract
Class imbalance in graph-structured data, where minority classes are significantly underrepresented, poses a critical challenge for Graph Neural Networks (GNNs). Existing studies generally address it by generating new minority nodes, together with edges connecting them to the original graph, to balance the classes. However, they leave one problem unsolved: majority classes still propagate information to minority nodes along the edges of the original graph, which introduces bias towards majority classes. To address this, we introduce BuffGraph, which inserts buffer nodes into the graph to modulate the impact of majority classes and improve minority-class representation. Extensive experiments across diverse real-world datasets demonstrate that BuffGraph outperforms existing baseline methods in class-imbalanced node classification in both natural and artificially imbalanced settings. Code is available at https://anonymous.4open.science/r/BuffGraph-730A.
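
To make the buffer-node idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each edge (u, v) is replaced by a path u - b - v through a new buffer node b, and that the features of b are a mixup (convex combination) of the endpoint features. The function name insert_buffer_nodes and the fixed mixing coefficient alpha are hypothetical; the heterophily-guided dynamic message passing and the joint loss are not modelled here.

```python
from typing import List, Tuple

Features = List[List[float]]
Edges = List[Tuple[int, int]]


def insert_buffer_nodes(
    features: Features, edges: Edges, alpha: float = 0.5
) -> Tuple[Features, Edges]:
    """Interpose one buffer node on every edge (illustrative sketch).

    Each edge (u, v) becomes two edges (u, b) and (b, v), where b is a new
    node whose features are alpha * x_u + (1 - alpha) * x_v.
    The fixed alpha is an assumption made for illustration only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Copy the original features so the input is left unmodified.
    new_features = [list(row) for row in features]
    new_edges: Edges = []

    for u, v in edges:
        b = len(new_features)  # index assigned to the new buffer node
        mixed = [
            alpha * xu + (1.0 - alpha) * xv
            for xu, xv in zip(features[u], features[v])
        ]
        new_features.append(mixed)
        # Messages between u and v now pass through b, taking two hops.
        new_edges.append((u, b))
        new_edges.append((b, v))

    return new_features, new_edges


if __name__ == "__main__":
    x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    e = [(0, 1), (1, 2)]
    new_x, new_e = insert_buffer_nodes(x, e)
    print(len(new_x), new_e)
```

In this sketch a graph with n nodes and m edges grows to n + m nodes and 2m edges, so information from a majority-class neighbour needs two hops rather than one to reach a minority node.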
