GOTHAM: Graph Class Incremental Learning Framework under Weak Supervision

Aditya Hemant Shahane, Prathosh A. P, Sandeep Kumar

TL;DR

GOTHAM tackles Graph Class Incremental Learning under Weak Supervision by building prototype-based representations from extended support sets and enriching them with semantic attributes for Text-Attributed Graphs. It combines episodic meta-learning with a multi-term metric-learning objective and teacher-student distillation to address few-shot and zero-shot novel classes while mitigating forgetting. Across GFSCIL and GCL settings, GOTHAM demonstrates consistent improvements on Cora-ML, Amazon, and OGBN-Arxiv, with semantic augmentation yielding particular gains on TAGs. The work advances robust node classification in dynamic graphs under limited supervision and provides a principled analysis of prototype distortion in evolving graphs.

Abstract

Graphs are growing rapidly, along with the number of distinct label categories associated with them. Applications such as e-commerce, healthcare, recommendation systems, and social media platforms are rapidly moving towards graph representations of data because graphs capture both structural and attribute information. One crucial task in graph analysis is node classification, where unlabeled nodes are categorized into predefined classes. In practice, novel classes appear incrementally, sometimes with just a few labels (seen classes) or even without any labels (unseen classes), either because they are new or have not been explored much. Traditional methods assume abundant labeled data for training, which is not always feasible. We investigate a broader objective, Graph Class Incremental Learning under Weak Supervision (GCL), and address this challenge by meta-training on base classes with limited labeled instances. During the incremental streams, novel classes can have few-shot or zero-shot representation. Our proposed framework, GOTHAM, efficiently accommodates these unlabeled nodes by finding the closest prototype representation, where prototypes serve as class representatives in the attribute space. For Text-Attributed Graphs (TAGs), our framework additionally incorporates semantic information to enhance the representation. By employing teacher-student knowledge distillation to mitigate forgetting, GOTHAM achieves promising results across various tasks. Experiments on datasets such as Cora-ML, Amazon, and OGBN-Arxiv showcase the effectiveness of our approach in handling evolving graph data under limited supervision. The code repository is available at https://github.com/adityashahane10/GOTHAM--Graph-based-Class-Incremental-Learning-Framework-under-Weak-Supervision
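
To make the abstract's "closest prototype" step concrete, here is a minimal sketch of nearest-prototype classification. It is not GOTHAM's implementation: the function names `class_prototypes` and `nearest_prototype`, the toy 2-D embeddings, the plain per-class mean, and the Euclidean distance are all assumptions chosen for illustration, and the semantic enrichment used for TAGs is not modeled.

```python
import math
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Average the embeddings of each class to get one prototype per class.

    Assumption: a plain mean over the labeled vectors of a class.
    """
    grouped = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def nearest_prototype(query, prototypes):
    """Return the label whose prototype is closest to the query.

    Assumption: Euclidean distance; the paper's metric may differ.
    """
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Hypothetical toy embeddings for two classes, "a" and "b".
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
support_labels = ["a", "a", "b", "b"]

protos = class_prototypes(support, support_labels)
prediction = nearest_prototype([0.9, 0.9], protos)
print(protos, prediction)
```

In this sketch an unlabeled node only needs an embedding to be classified, which is why a class with very few labeled nodes can still be handled once it has a prototype.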

Paper Structure

This paper contains 20 sections, 31 equations, 5 figures, 6 tables, 1 algorithm.

Figures (5)

  • Figure 1: Graph Class Incremental Learning under Weak Supervision: (A) In the base graph $G^{base}$, the base classes $C^{base}$ have extremely limited labeled instances. (B) In the streaming sessions, graph $G^t$ has $C^t$ classes. Depending on the availability of training instances, the classes are further divided into $C^{t, S}$ (seen classes) and $C^{t, U}$ (unseen classes). Seen classes are represented with $k$-shots, along with semantic attributes (CSDs). For unseen classes, only CSD information is available. The goal is to classify the unlabeled instances into the $C^t$ classes encountered so far (10.1145/3534678.3539280; 10.1145/3488560.3498455).
  • Figure 2: Prototype representation: For the GFSCIL task, we propose representing prototypes $(P_{c, S})$ using the averaged extended support set, as illustrated in (i). As demonstrated in (ii), we integrate semantic attributes (CSDs) to enhance the prototypes $(\overline{P}_{c, S})$ in TAGs. For GCL tasks with classes having no training instances, the semantic attributes (CSDs) are encoded as prototypes $(\overline{P}_{c, U})$.
  • Figure 3: GOTHAM III.o: At any time $t$, the framework uses the graph $G^t$ as input. The total classes are $C^t = C^{t, S} \cup C^{t, U}$. The steps are: (1) Create tasks ($T$) with support sets ($\mathcal{S}$) and query sets ($\mathcal{Q}$) for episodic learning. (2) Obtain prototype representations for each support set ($\mathcal{S}^{i}_x$). (3) Apply loss functions. (4) Use knowledge distillation to transfer knowledge from the teacher model to the student model. (5) Perform node classification.
  • Figure 4: (A) Contribution of different loss functions on the Cora-ML dataset. (B) Support set sampling: determining the ideal random-walk length. (C) Different GNN backbones on the Cora-ML and Amazon datasets. (A) and (C) display performance vs. streaming sessions, while (B) shows performance vs. random-walk length.
  • Figure 5: Performance analysis of the GOTHAM framework on the OGBN-Arxiv and Cora-ML datasets. (Left): GCL with a 3-way $k$-shot setting shows consistent performance, even in zero-shot learning cases. (Right): GCL with the 1-way $k$-shot setting on Cora-ML.
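
Step (1) in the Figure 3 caption, creating tasks with support and query sets for episodic learning, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' sampler: `sample_episode`, its parameters, and the toy `nodes_by_class` mapping are hypothetical, and nodes are drawn uniformly at random, whereas the captions mention extended support sets and random-walk-based sampling that are not modeled here.

```python
import random


def sample_episode(nodes_by_class, n_way, k_shot, n_query, rng):
    """Sample one N-way k-shot episode with disjoint support and query sets.

    Assumption: uniform sampling of classes and of nodes within each class.
    """
    # Only classes with enough nodes for both sets can take part in an episode.
    eligible = [c for c, nodes in nodes_by_class.items()
                if len(nodes) >= k_shot + n_query]
    classes = rng.sample(sorted(eligible), n_way)

    support, query = {}, {}
    for c in classes:
        picked = rng.sample(nodes_by_class[c], k_shot + n_query)
        support[c] = picked[:k_shot]   # labeled examples used to build prototypes
        query[c] = picked[k_shot:]     # held-out examples used to compute the loss
    return support, query


# Hypothetical toy graph: 5 classes with 10 node ids each.
rng = random.Random(0)
nodes_by_class = {c: list(range(c * 10, c * 10 + 10)) for c in range(5)}

support, query = sample_episode(nodes_by_class, n_way=3, k_shot=2, n_query=4, rng=rng)
print(support)
print(query)
```

Each sampled episode would then feed steps (2) and (3): prototypes are computed from the support set, and the loss is evaluated on the query set.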