Open-World Semi-Supervised Learning for Node Classification

Yanling Wang, Jing Zhang, Lingxi Zhang, Lixin Liu, Yuxiao Dong, Cuiping Li, Hong Chen, Hongzhi Yin

TL;DR

This paper proposes OpenIMA, an IMbalance-Aware method for Open-world semi-supervised node classification, which trains the node classification model from scratch via contrastive learning with bias-reduced pseudo labels.

Abstract

Open-world semi-supervised learning (Open-world SSL) for node classification, which classifies unlabeled nodes into seen classes or multiple novel classes, is a practical but under-explored problem in the graph community. Because only seen classes have human labels, they are usually learned better than novel classes and thus exhibit smaller intra-class variances in the embedding space (we call this the imbalance of intra-class variances between seen and novel classes). Based on empirical and theoretical analysis, we find that this variance imbalance can negatively impact model performance. Pre-trained feature encoders can alleviate the issue by producing compact representations for novel classes. However, creating general pre-trained encoders for various types of graph data has proven challenging. There is therefore a demand for an effective method that does not rely on pre-trained graph encoders. In this paper, we propose an IMbalance-Aware method named OpenIMA for Open-world semi-supervised node classification, which trains the node classification model from scratch via contrastive learning with bias-reduced pseudo labels. Extensive experiments on seven popular graph benchmarks demonstrate the effectiveness of OpenIMA, and the source code is available on GitHub.
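
The variance imbalance described in the abstract can be made concrete with a small sketch. The code below is illustrative only and is not taken from the OpenIMA source: it assumes intra-class variance is measured as the mean squared distance of a class's embeddings to the class centroid, which may differ from the paper's exact definition. The function names `intra_class_variance` and `variance_imbalance` are hypothetical, and the toy embeddings are made up. Ground-truth labels for novel classes would normally be available only at evaluation time.

```python
from collections import defaultdict


def intra_class_variance(embeddings, labels):
    """Mean squared distance to the class centroid, per class (assumed definition)."""
    groups = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        groups[label].append(vec)

    variances = {}
    for label, members in groups.items():
        dim = len(members[0])
        # Centroid: coordinate-wise mean of the class's embeddings.
        centroid = [sum(v[d] for v in members) / len(members) for d in range(dim)]
        # Squared Euclidean distance of every member to the centroid.
        sq_dists = [sum((v[d] - centroid[d]) ** 2 for d in range(dim)) for v in members]
        variances[label] = sum(sq_dists) / len(members)
    return variances


def variance_imbalance(embeddings, labels, seen_classes):
    """Ratio of mean novel-class variance to mean seen-class variance (hypothetical metric)."""
    per_class = intra_class_variance(embeddings, labels)
    seen = [v for c, v in per_class.items() if c in seen_classes]
    novel = [v for c, v in per_class.items() if c not in seen_classes]
    return (sum(novel) / len(novel)) / (sum(seen) / len(seen))


# Toy 2-D embeddings: class 0 is treated as seen (tight), class 1 as novel (spread out).
embeddings = [[0.0, 0.0], [2.0, 0.0], [0.0, 0.0], [0.0, 4.0]]
labels = [0, 0, 1, 1]
ratio = variance_imbalance(embeddings, labels, seen_classes={0})
print(ratio)
```

A ratio well above 1 would indicate that novel classes are spread more widely than seen classes in the embedding space, which is the situation the abstract identifies as harmful.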

Paper Structure (26 sections, 2 theorems, 38 equations, 2 figures, 7 tables)

This paper contains 26 sections, 2 theorems, 38 equations, 2 figures, and 7 tables.

Key Result

Theorem 1

With $1<\gamma<2$, for any $\delta$, there exists a constant $\overline{N}$ such that, if the number of samples $N \geq \overline{N}$, then with probability at least $1-\delta$: (1) if $1.5<\alpha<3$, $ACC_2$ and $\sigma_1$ are a.s. positively correlated; (2) if $\alpha>3$, $|1- ACC_1|<0.05$ and $|1- ACC_2|<0.

Figures (2)

  • Figure 1: Motivations of this work. (a) The solid circles represent labeled nodes, and the dashed circles represent unlabeled nodes. (b) The results are averaged over ten runs. Model performance on other datasets is reported in Table \ref{tab:overall_result}.
  • Figure 2: Effects of hyper-parameters $\eta$ and $\rho$.

Theorems & Definitions (3)

  • Definition 1
  • Theorem 1
  • Lemma 1