Class-Balanced and Reinforced Active Learning on Graphs

Chengcheng Yu, Jiapeng Zhu, Xiang Li

TL;DR

The paper addresses class imbalance during active learning (AL) on graphs by proposing a class-balanced reinforcement learning framework, GCBR, and an enhanced version, GCBR++, that adds a punishment mechanism. It formulates class-balanced graph AL as a Markov Decision Process, defines a five-factor, class-balance-aware state, and designs a reward that blends validation performance with class diversity. The policy is a two-layer GCN optimized with Advantage Actor-Critic (A2C). The authors report that GCBR, and especially GCBR++, yield higher Macro-F1 and better imbalance ratios across six benchmarks, with notable improvements on tail classes, and that the methods are robust to budgets and hyperparameters. The work offers a scalable, graph-aware approach to fair and informative node annotation, with practical value for reliable GNN training on skewed real-world graphs.

Abstract

Graph neural networks (GNNs) have demonstrated significant success in various applications, such as node classification, link prediction, and graph classification. Active learning for GNNs aims to query the most valuable samples from the unlabeled data for annotation, maximizing the GNNs' performance at a lower cost. However, most existing algorithms for reinforced active learning in GNNs may lead to a highly imbalanced class distribution in the labeled set, especially when the underlying classes are highly skewed. GNNs trained with class-imbalanced labeled data are susceptible to bias toward majority classes, and the lower performance on minority classes may lead to a decline in overall performance. To tackle this issue, we propose a novel class-balanced and reinforced active learning framework for GNNs, namely, GCBR. It learns an optimal policy to acquire class-balanced and informative nodes for annotation, maximizing the performance of GNNs trained with the selected labeled nodes. GCBR designs class-balance-aware states, as well as a reward function that achieves a trade-off between model performance and class balance. The reinforcement learning algorithm Advantage Actor-Critic (A2C) is employed to learn an optimal policy stably and efficiently. We further upgrade GCBR to GCBR++ by introducing a punishment mechanism to obtain a more class-balanced labeled set. Extensive experiments on multiple datasets demonstrate the effectiveness of the proposed approaches, which achieve superior performance over state-of-the-art baselines.
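
To make the idea of a reward that trades off model performance against class balance concrete, here is a minimal sketch. It is not the paper's reward: the exact formula is not given in this summary. The sketch assumes an additive form, uses the normalized entropy of the labeled-set class distribution as the balance term, and uses a hypothetical weight `alpha`. The function names are illustrative only.

```python
import math
from collections import Counter


def class_balance_score(labels, num_classes):
    """Normalized entropy of the labeled-set class distribution.

    Returns a value in [0, 1]: 1.0 when all classes are equally
    represented, 0.0 when every labeled node has the same class.
    (Assumption: entropy is one possible balance measure, not
    necessarily the one used in the paper.)
    """
    if not labels or num_classes < 2:
        return 0.0
    counts = Counter(labels)
    total = len(labels)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return max(0.0, entropy / math.log(num_classes))


def blended_reward(val_score, labels, num_classes, alpha=0.5):
    """Hypothetical reward: validation score plus a weighted balance term."""
    return val_score + alpha * class_balance_score(labels, num_classes)


if __name__ == "__main__":
    balanced = [0, 0, 1, 1]
    skewed = [0, 0, 0, 0]
    # With the same validation score, the balanced labeled set earns more reward.
    print(blended_reward(0.8, balanced, num_classes=2))
    print(blended_reward(0.8, skewed, num_classes=2))
```

Under these assumptions, two labeled sets that produce the same validation score are ranked by how evenly they cover the classes, which is the behavior the abstract describes at a high level.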

Paper Structure (28 sections, 16 equations, 6 figures, 3 tables, 1 algorithm)

Figures (6)

  • Figure 1: (a) The class distribution of labeled nodes acquired by GCBR++ is more balanced than that of GPA and ALLIE on the Co-Phy dataset. (b) GCBR++ outperforms GPA and ALLIE on overall accuracy and improves the performance of tail classes by a large margin, owing to its class-balanced and valuable training data.
  • Figure 2: The framework of GCBR and GCBR++. Blue and red nodes denote the classes of labeled nodes, and blank nodes are unlabeled nodes. The policy will query a more class-balanced and valuable node for annotation at each step.
  • Figure 3: Node classification results (Macro-F1, Micro-F1) and Imb-ratio on Reddit and Co-Phy under different test budgets.
  • Figure 4: The performance (Micro-F1, Macro-F1) and imbalance ratio of GCBR under different scaling factors $\alpha$.
  • Figure 5: The performance (Micro-F1, Macro-F1) and imbalance ratio of GCBR++ under different penalty scores $\eta$.
  • ...and 1 more figure
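
The figures report Macro-F1, Micro-F1, and an imbalance ratio. The sketch below shows how such metrics can be computed and why Macro-F1 is the more sensitive one for tail classes. Macro-F1 and Micro-F1 follow their standard definitions for single-label classification. The imbalance ratio is assumed here to be the largest class count divided by the smallest; the paper's exact definition is not given in this summary, and the function names are illustrative.

```python
from collections import Counter


def imbalance_ratio(labels, num_classes):
    """Assumed definition: largest class count / smallest class count."""
    counts = Counter(labels)
    per_class = [counts.get(c, 0) for c in range(num_classes)]
    smallest = min(per_class)
    return float("inf") if smallest == 0 else max(per_class) / smallest


def f1_scores(y_true, y_pred, num_classes):
    """Return (macro_f1, micro_f1) for single-label classification."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_c, fp_c, fn_c):
        denom = 2 * tp_c + fp_c + fn_c
        return 2 * tp_c / denom if denom else 0.0

    # Macro: average per-class F1, so every class counts equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in range(num_classes)) / num_classes
    # Micro: pool counts over classes, so frequent classes dominate.
    micro = f1(sum(tp), sum(fp), sum(fn))
    return macro, micro


if __name__ == "__main__":
    # A classifier that always predicts the majority class 0.
    y_true = [0, 0, 0, 0, 1]
    y_pred = [0, 0, 0, 0, 0]
    print(f1_scores(y_true, y_pred, num_classes=2))
    print(imbalance_ratio(y_true, num_classes=2))
```

In this toy case the majority-only classifier reaches a Micro-F1 of 0.8 but a Macro-F1 of only about 0.44, because the missed tail class contributes an F1 of zero to the macro average.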