Cluster-guided Contrastive Class-imbalanced Graph Classification
Wei Ju, Zhengyang Mao, Siyu Yi, Yifang Qin, Yiyang Gu, Zhiping Xiao, Jianhao Shen, Ziyue Qiao, Ming Zhang
TL;DR
This work tackles class-imbalanced graph classification by introducing C$^3$GNN, which combines adaptive clustering, subclass-level Mixup, and hierarchical supervised contrastive learning to balance learning across majority and minority classes. By decomposing each majority class into semantically coherent subclasses comparable in size to the minority classes, and enriching those subclasses via Mixup, the method enables balanced, fine-grained representation learning across a hierarchy of class and subclass labels. The approach outperforms diverse baselines on six benchmarks, with ablations validating the contribution of each component and analyses confirming balanced, informative feature distributions. The proposed framework offers a practical route to robust graph classification in real-world imbalanced domains, using the hierarchical structure to prevent minority-class overfitting while preserving majority-class richness.
Abstract
This paper studies the problem of class-imbalanced graph classification, which aims to classify graphs accurately when the class distribution is imbalanced. While graph neural networks (GNNs) have achieved remarkable success, their modeling ability on imbalanced graph-structured data remains suboptimal, which typically leads to predictions biased towards the majority classes. On the other hand, existing class-imbalanced learning methods developed for vision may overlook the rich semantic substructures of graphs in the majority classes and place excessive emphasis on learning from the minority classes. To address these challenges, we propose a simple yet powerful approach called C$^3$GNN that integrates clustering into contrastive learning to enhance class-imbalanced graph classification. Technically, C$^3$GNN clusters the graphs of each majority class into multiple subclasses whose sizes are comparable to that of the minority class, which mitigates the class imbalance. It also employs the Mixup technique to generate synthetic samples, enriching the semantic diversity of each subclass. Furthermore, supervised contrastive learning is used to hierarchically learn effective graph representations, enabling the model to thoroughly explore semantic substructures in majority classes while avoiding excessive focus on minority classes. Extensive experiments on real-world graph benchmark datasets verify the superior performance of our proposed method against competitive baselines.
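The three ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: graphs are taken to be already encoded as fixed-size embedding vectors, plain k-means stands in for the adaptive clustering (it does not guarantee equal subclass sizes), Mixup is applied to embeddings of graphs within the same subclass, and the hierarchical objective is written as a simple sum of a class-level and a subclass-level supervised contrastive loss. All function names and the parameters `alpha` and `tau` are hypothetical.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (a stand-in for the paper's adaptive clustering)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every point to every center: (n, k).
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = x[assign == j].mean(0)
    return assign

def split_into_subclasses(emb, labels, seed=0):
    """Split each class into about n_c / n_min clusters, so that subclass
    sizes are roughly comparable to the smallest class (assumed rule)."""
    classes, counts = np.unique(labels, return_counts=True)
    n_min = counts.min()
    sub = np.zeros(len(labels), dtype=int)
    next_id = 0
    for c, n_c in zip(classes, counts):
        idx = np.flatnonzero(labels == c)
        k = max(1, int(round(n_c / n_min)))
        sub[idx] = next_id + (kmeans(emb[idx], k, seed=seed) if k > 1 else 0)
        next_id += k  # subclass ids never overlap across classes
    return sub

def mixup_within_subclass(emb, sub, alpha=1.0, seed=0):
    """Mix each embedding with a random partner from the same subclass,
    so the class and subclass labels of the mixed sample are unchanged."""
    rng = np.random.default_rng(seed)
    partner = np.arange(len(emb))
    for s in np.unique(sub):
        idx = np.flatnonzero(sub == s)
        partner[idx] = rng.permutation(idx)
    lam = rng.beta(alpha, alpha, size=(len(emb), 1))
    return lam * emb + (1.0 - lam) * emb[partner]

def sup_con_loss(z, y, tau=0.5):
    """Supervised contrastive loss over one batch; positives share a label."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / tau
    np.fill_diagonal(sim, -np.inf)  # an anchor is never its own candidate
    log_prob = sim - np.log(np.exp(sim).sum(1, keepdims=True))
    pos = (y[:, None] == y[None, :]) & ~np.eye(len(y), dtype=bool)
    has_pos = pos.any(1)  # skip anchors that have no positive in the batch
    summed = np.where(pos, log_prob, 0.0).sum(1)
    return -(summed[has_pos] / pos.sum(1)[has_pos]).mean()

# Toy usage: 40 majority-class and 10 minority-class embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 4))
labels = np.array([0] * 40 + [1] * 10)
sub = split_into_subclasses(emb, labels)
mixed = mixup_within_subclass(emb, sub)
# Assumed hierarchical objective: class-level plus subclass-level terms.
loss = sup_con_loss(mixed, labels) + sup_con_loss(mixed, sub)
```

In this toy setting the majority class is split into up to four subclasses and the minority class is kept whole, so the subclass-level term sees groups of similar size while the class-level term still pulls all subclasses of a class together.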
