Rethinking Semi-Supervised Imbalanced Node Classification from Bias-Variance Decomposition
Liang Yan, Gengchen Wei, Chen Yang, Shengzhong Zhang, Zengfeng Huang
TL;DR
This work tackles imbalanced node classification on graphs by reinterpreting the problem through Bias-Variance Decomposition and linking class imbalance to increased model variance. It proposes ReVar, a variance-regularized semi-supervised framework that uses graph augmentations to estimate variance and a class-center-based regularization term to compensate minority classes, optimized via a two-view objective. The approach combines variance regularization with intra-class aggregation to produce robust representations, achieving state-of-the-art results on both public-split class-imbalanced and naturally imbalanced graph benchmarks. The work provides a theoretical lens and practical algorithmic tools that improve minority-class performance and offer a path toward principled design of GNNs under data imbalance.
Abstract
This paper introduces a new approach to address class imbalance in graph neural networks (GNNs) for learning on graph-structured data. Our approach integrates imbalanced node classification with Bias-Variance Decomposition, establishing a theoretical framework that closely relates data imbalance to model variance. We also leverage graph augmentation techniques to estimate the variance, and design a regularization term to alleviate the impact of imbalance. Extensive experiments are conducted on multiple benchmarks, including naturally imbalanced datasets and public-split class-imbalanced datasets, demonstrating that our approach outperforms state-of-the-art methods in various imbalanced scenarios. This work provides a novel theoretical perspective for addressing the problem of imbalanced node classification in GNNs.
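To make the idea of estimating variance from graph augmentations concrete, the following is a minimal sketch and not the ReVar implementation. It assumes random edge dropping as the augmentation, a single mean-aggregation layer as a stand-in for a GNN, and the two-sample variance of the predicted class probabilities as the per-node variance proxy. All function names (`drop_edges`, `propagate`, `two_view_variance`) and parameters are hypothetical and chosen for illustration only.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def drop_edges(adj, p, rng):
    # Assumed augmentation: remove each undirected edge with probability p.
    upper = np.triu(adj, k=1)
    keep = rng.random(upper.shape) >= p
    upper = upper * keep
    return upper + upper.T

def propagate(adj, x, w):
    # Stand-in for a GNN: one mean-aggregation layer with self-loops,
    # followed by a softmax over classes.
    a = adj + np.eye(adj.shape[0])
    a = a / a.sum(axis=1, keepdims=True)
    return softmax(a @ x @ w)

def two_view_variance(adj, x, w, p=0.2, seed=0):
    # Predict on two independently augmented views of the same graph.
    rng = np.random.default_rng(seed)
    p1 = propagate(drop_edges(adj, p, rng), x, w)
    p2 = propagate(drop_edges(adj, p, rng), x, w)
    # Unbiased sample variance of two values a, b is (a - b)^2 / 2;
    # sum it over classes to get one scalar per node.
    return 0.5 * ((p1 - p2) ** 2).sum(axis=1)

# Toy usage: a 6-node ring graph, 4 features, 3 classes.
n = 6
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0
rng = np.random.default_rng(1)
x = rng.normal(size=(n, 4))
w = rng.normal(size=(4, 3))

node_var = two_view_variance(adj, x, w, p=0.5)
reg_term = node_var.mean()  # could be added to a supervised loss with a weight
```

In a training loop, a term like `reg_term` would be scaled by a hyperparameter and added to the classification loss, so that predictions that change a lot between views are penalized. How the paper weights this term and combines it with the class-center regularization is not specified in the abstract, so that part is left out of the sketch.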
