Distribution Consistency based Self-Training for Graph Neural Networks with Sparse Labels

Fali Wang, Tianxiang Zhao, Suhang Wang

TL;DR

The paper tackles few-shot node classification under distribution shift by introducing Distribution Consistency based Graph Self-Training (DC-GST). A distribution-aware teacher is learned jointly with an Edge Predictor, which augments the graph structure and reduces the representation distance between the training and test sets. Pseudo-labels are then selected with a differentiable criterion that combines distribution consistency (DC) with Neighborhood Entropy Reduction, so that chosen nodes are both informative and help close the train/test gap. A student model is finally trained on the expanded training set, improving accuracy across four benchmarks and several GNN backbones. The approach addresses shift that exists before pseudo-labeling as well as shift introduced by pseudo-labeling itself.

Abstract

Few-shot node classification poses a significant challenge for Graph Neural Networks (GNNs) due to insufficient supervision and potential distribution shifts between labeled and unlabeled nodes. Self-training has emerged as a widely popular framework to leverage the abundance of unlabeled data, which expands the training set by assigning pseudo-labels to selected unlabeled nodes. Efforts have been made to develop various selection strategies based on confidence, information gain, etc. However, none of these methods takes into account the distribution shift between the training and testing node sets. The pseudo-labeling step may amplify this shift and even introduce new ones, hindering the effectiveness of self-training. Therefore, in this work, we explore the potential of explicitly bridging the distribution shift between the expanded training set and test set during self-training. To this end, we propose a novel Distribution-Consistent Graph Self-Training (DC-GST) framework to identify pseudo-labeled nodes that are both informative and capable of redeeming the distribution discrepancy and formulate it as a differentiable optimization task. A distribution-shift-aware edge predictor is further adopted to augment the graph and increase the model's generalizability in assigning pseudo labels. We evaluate our proposed method on four publicly available benchmark datasets and extensive experiments demonstrate that our framework consistently outperforms state-of-the-art baselines.
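
For orientation, the confidence-based selection that the abstract cites as an existing strategy can be sketched as follows. This is a minimal illustration of generic self-training, not DC-GST's selection criterion; the function name `select_pseudo_labels`, the 0.9 threshold, and the toy probabilities are assumptions made for the example.

```python
import numpy as np

def select_pseudo_labels(probs, unlabeled_idx, threshold=0.9):
    """Generic confidence-based selection (hypothetical helper, not DC-GST).

    probs: (num_nodes, num_classes) class probabilities from a teacher model.
    unlabeled_idx: indices of nodes that have no ground-truth label.
    Returns the selected node indices and their pseudo-labels.
    """
    probs = np.asarray(probs, dtype=float)
    unlabeled_idx = np.asarray(unlabeled_idx, dtype=int)
    sub = probs[unlabeled_idx]
    confidence = sub.max(axis=1)   # top class probability per node
    labels = sub.argmax(axis=1)    # predicted class per node
    keep = confidence >= threshold # keep only confident predictions
    return unlabeled_idx[keep], labels[keep]

# Toy teacher output for four unlabeled nodes and two classes (made-up values).
probs = np.array([
    [0.95, 0.05],  # node 0: confident, class 0
    [0.55, 0.45],  # node 1: uncertain
    [0.08, 0.92],  # node 2: confident, class 1
    [0.50, 0.50],  # node 3: uncertain
])
nodes, labels = select_pseudo_labels(probs, unlabeled_idx=[0, 1, 2, 3])
```

Because such a rule looks only at prediction confidence, nothing in it controls whether the selected nodes resemble the test nodes, which is the gap the abstract says DC-GST targets.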

Paper Structure (32 sections, 15 equations, 6 figures, 5 tables, 1 algorithm)

Figures (6)

  • Figure 1: CMD on Cora: minimal distribution shift with the proposed method using GCN backbone against advanced self-training methods including ST, M3S, and DR-GST. Note that the value from GCN signifies the initial shift, as the CMD computation is from the high-level embedding space.
  • Figure 2: The comparison between original and variant graphs. Higher consistency (e.g., in circled areas) can be achieved with EP(CMD).
  • Figure 3: The proposed distribution consistency based graph self-training framework. Red arrows indicate the loop.
  • Figure 4: Sensitivity analysis of hyper-parameters w.r.t. $\alpha$, $\beta$, and $\gamma$ on Cora and Citeseer with 0.5% label rate.
  • Figure 5: Visual Analysis of CMD and Accuracy on Citeseer with a 2% Label Rate. Sub-figures (a) and (b) illustrate CMD and ACC for DC-GST on random and biased training samples respectively, while (c) and (d) display M3S's CMD and ACC under the same conditions. The shadowed area refers to the variance of the results of the 10 runs.
  • ...and 1 more figures
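
Several captions above report CMD between training and test representations. Assuming CMD here denotes Central Moment Discrepancy, a minimal NumPy sketch of that distance is given below. The function name `cmd`, the moment order `k_max=5`, and the embedding range `[a, b] = [0, 1]` are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np

def cmd(x, y, k_max=5, a=0.0, b=1.0):
    """Central Moment Discrepancy between two sets of embeddings.

    x, y: arrays of shape (n_samples, dim), assumed to lie in [a, b].
    k_max: highest central-moment order compared (assumed value).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    span = abs(b - a)
    mean_x, mean_y = x.mean(axis=0), y.mean(axis=0)
    # First-order term: distance between the per-dimension means.
    dist = np.linalg.norm(mean_x - mean_y) / span
    centered_x, centered_y = x - mean_x, y - mean_y
    # Higher-order terms: distance between per-dimension central moments.
    for k in range(2, k_max + 1):
        moment_x = (centered_x ** k).mean(axis=0)
        moment_y = (centered_y ** k).mean(axis=0)
        dist += np.linalg.norm(moment_x - moment_y) / span ** k
    return dist

# Toy 1-D embeddings: same spread, means shifted by 0.5 (made-up values).
train_emb = np.array([[0.0], [0.2]])
test_emb = np.array([[0.5], [0.7]])
same = cmd(train_emb, train_emb)
shifted = cmd(train_emb, test_emb)
```

In the toy case the two sets differ only in their means, so the higher-order terms vanish and the distance reduces to the mean gap; a lower value indicates closer training and test representations.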

Theorems & Definitions (1)

  • Definition 1: Distribution shift in GNNs