Geometric Imbalance in Semi-Supervised Node Classification
Liang Yan, Shengzhong Zhang, Bisheng Li, Menglin Yang, Chen Yang, Min Zhou, Weiyang Ding, Yutong Xie, Zengfeng Huang
TL;DR
This work identifies geometric imbalance (GI) as an angular ambiguity in unit-sphere embeddings produced by graph neural networks under class imbalance, formalizing its link to prediction uncertainty via a von Mises–Fisher model on the hypersphere. It offers a unified, modular framework called UNREAL to mitigate GI through three components: DPAM for aligning pseudo-labels from clustering and prediction, Node-Reordering to fuse geometry and confidence while gradually shifting reliance from geometry to confidence, and DGIS to discard geometrically ambiguous samples. Theoretical results connect GI to entropy and imbalance ratio, while extensive experiments on nine benchmarks (including large-scale datasets) show consistent gains over state-of-the-art baselines, particularly as imbalance intensifies. The approach advances both theory and practice for robust semi-supervised node classification on imbalanced graphs, with potential extensions to other graph tasks.
Abstract
Class imbalance in graph data presents a significant challenge for effective node classification, particularly in semi-supervised scenarios. In this work, we formally introduce the concept of geometric imbalance, which captures how message passing on class-imbalanced graphs leads to geometric ambiguity among minority-class nodes in the Riemannian manifold embedding space. We provide a rigorous theoretical analysis of geometric imbalance on the Riemannian manifold and propose a unified framework that explicitly mitigates it through pseudo-label alignment, node reordering, and ambiguity filtering. Extensive experiments on diverse benchmarks show that our approach consistently outperforms existing methods, especially under severe class imbalance. Our findings offer new theoretical insights and practical tools for robust semi-supervised node classification.
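To make the idea of ambiguity filtering concrete, the following is a minimal sketch of one way angular ambiguity on the unit sphere could be measured and used to discard samples. It is an illustration under stated assumptions, not the paper's definition: the function names `angular_margin` and `filter_ambiguous`, the use of class centroids, and the threshold `min_margin` are all hypothetical choices made here. A node is treated as ambiguous when the angles to its two nearest class centroids are nearly equal.

```python
import numpy as np


def angular_margin(embeddings, centroids):
    """Gap between the two smallest angles from each node to the class centroids.

    Hypothetical ambiguity proxy (an assumption, not the paper's formula):
    a small gap means the node sits almost equally close to two classes.
    """
    # Project nodes and centroids onto the unit sphere so only direction matters.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    # Cosine similarity, clipped to guard arccos against rounding error.
    cos = np.clip(z @ c.T, -1.0, 1.0)
    # Sort angles per node: column 0 is the nearest centroid, column 1 the runner-up.
    angles = np.sort(np.arccos(cos), axis=1)
    return angles[:, 1] - angles[:, 0]


def filter_ambiguous(embeddings, centroids, min_margin=0.2):
    """Boolean mask that is True for nodes whose margin is at least min_margin (radians)."""
    return angular_margin(embeddings, centroids) >= min_margin


# Toy example with two classes in two dimensions.
centroids = np.array([[1.0, 0.0], [0.0, 1.0]])
embeddings = np.array([[2.0, 0.1], [1.0, 1.0], [0.1, 3.0]])
keep = filter_ambiguous(embeddings, centroids)
```

In this toy example the middle node lies on the bisector between the two centroids, so its margin is zero and it is the only one filtered out. In practice the centroids would have to be estimated from labeled or pseudo-labeled nodes, and the threshold would need tuning per dataset.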
