Beyond the Known: Novel Class Discovery for Open-world Graph Learning

Yucheng Jin, Yun Xiong, Juncheng Fang, Xixi Wu, Dongxiao He, Xing Jia, Bingchen Zhao, Philip Yu

TL;DR

Open-world graph learning seeks to identify novel classes that emerge among unlabeled test nodes, a setting where inter-class edges can blur the representations learned by conventional message-passing GNNs. The authors propose ORAL, which combines a prototypical attention network (PAN) that detects and removes inter-class correlations with a multi-layer, multi-scale design that exploits unlabeled data through a pseudo-label generator and a structure-refinement module. The model runs semi-supervised prototypical clustering on a small prototype graph to compute group assignments and trains with cross-entropy losses. Predictions from the stacked layers are aligned and ensembled to produce reliable pseudo-labels, which are then used to refine the graph structure through intra- and inter-class edge adjustments, guided by a permutation-consistent loss. Experiments on Cora, AmazonPhoto, and BlogCatalog show that ORAL consistently outperforms the baselines on both known and novel class recognition and accurately estimates the number of novel classes.

Abstract

Node classification on graphs is of great importance in many applications. Due to the limited labeling capability and evolution in real-world open scenarios, novel classes can emerge on unlabeled testing nodes. However, little attention has been paid to novel class discovery on graphs. Discovering novel classes is challenging as novel and known class nodes are correlated by edges, which makes their representations indistinguishable when applying message passing GNNs. Furthermore, the novel classes lack labeling information to guide the learning process. In this paper, we propose a novel method Open-world gRAph neuraL network (ORAL) to tackle these challenges. ORAL first detects correlations between classes through semi-supervised prototypical learning. Inter-class correlations are subsequently eliminated by the prototypical attention network, leading to distinctive representations for different classes. Furthermore, to fully explore multi-scale graph features for alleviating label deficiencies, ORAL generates pseudo-labels by aligning and ensembling label estimations from multiple stacked prototypical attention networks. Extensive experiments on several benchmark datasets show the effectiveness of our proposed method.
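The "aligning and ensembling" step can be illustrated with a small sketch. Cluster IDs produced independently by different layers are arbitrary, so two layers can describe the same partition with different numbers. The abstract does not say how ORAL performs the alignment, so everything below is an assumption made for illustration: the function names `align_labels` and `ensemble_pseudo_labels`, the choice of the first layer as the reference, the brute-force search over permutations, and the `min_votes` agreement threshold are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def align_labels(reference, labels, num_classes):
    """Relabel `labels` with the class-ID permutation that best matches `reference`.

    Brute force over all permutations: only practical for a handful of classes.
    """
    best_perm, best_agree = None, -1
    for perm in permutations(range(num_classes)):
        # Count nodes whose relabeled ID equals the reference ID.
        agree = sum(1 for r, l in zip(reference, labels) if perm[l] == r)
        if agree > best_agree:
            best_perm, best_agree = perm, agree
    return [best_perm[l] for l in labels]


def ensemble_pseudo_labels(layer_labels, num_classes, min_votes):
    """Align every layer to the first one, then vote per node.

    A node receives a pseudo-label only if at least `min_votes` layers agree;
    otherwise it is left unlabeled (None). Both choices are assumptions.
    """
    reference = layer_labels[0]
    aligned = [reference] + [
        align_labels(reference, labels, num_classes) for labels in layer_labels[1:]
    ]
    pseudo = []
    for votes in zip(*aligned):
        label, count = Counter(votes).most_common(1)[0]
        pseudo.append(label if count >= min_votes else None)
    return pseudo


# Toy label estimates for six nodes from three hypothetical layers.
layer_labels = [
    [0, 0, 1, 1, 2, 2],
    [1, 1, 2, 2, 0, 0],  # same partition as the first layer, IDs permuted
    [2, 2, 0, 1, 1, 1],  # permuted IDs, and it disagrees on node 3
]
pseudo = ensemble_pseudo_labels(layer_labels, num_classes=3, min_votes=3)
print(pseudo)  # [0, 0, 1, None, 2, 2]
```

The permutation search grows factorially with the number of classes, so a real implementation would more likely solve the matching with the Hungarian algorithm. Leaving disputed nodes unlabeled is one simple way to keep only confident pseudo-labels; the paper may use a different rule.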

Paper Structure

This paper contains 20 sections, 14 equations, 5 figures, 4 tables, and 1 algorithm.

Figures (5)

  • Figure 1: Illustration of novel class discovery for open-world learning on an academic graph. Nodes represent papers, edges represent citation relationships, and each paper belongs to a certain research field (node class).
  • Figure 2: An overview of the proposed model ORAL.
  • Figure 3: Performance of baselines with different graph convolution networks. Our ORAL consistently outperforms all variants of existing open-world baseline methods, showing its superiority.
  • Figure 4: Performance of different prototypical attention network layers.
  • Figure 5: Impact of key hyper-parameters on the performance of ORAL on the BlogCatalog dataset when the class number is known.