Replay-and-Forget-Free Graph Class-Incremental Learning: A Task Profiling and Prompting Approach

Chaoxi Niu, Guansong Pang, Ling Chen, Bing Liu

TL;DR

It is shown theoretically that accurate task ID prediction on graph data can be achieved by a Laplacian smoothing-based graph task profiling approach, in which each graph task is modeled by a task prototype based on Laplacian smoothing over the graph.

Abstract

Class-incremental learning (CIL) aims to continually learn a sequence of tasks, with each task consisting of a set of unique classes. Graph CIL (GCIL) follows the same setting but needs to deal with graph tasks (e.g., node classification in a graph). The key characteristic of CIL is the absence of task identifiers (IDs) during inference, which makes it difficult to separate classes from different tasks (i.e., inter-task class separation). Accurately predicting the task IDs can help address this issue, but doing so is itself a challenging problem. In this paper, we show theoretically that accurate task ID prediction on graph data can be achieved by a Laplacian smoothing-based graph task profiling approach, in which each graph task is modeled by a task prototype based on Laplacian smoothing over the graph. This guarantees that the task prototypes of the same graph task are nearly the same with a large smoothing step, while those of different tasks are distinct due to differences in graph structure and node attributes. Further, to avoid catastrophic forgetting of the knowledge learned in previous graph tasks, we propose a novel graph prompting approach for GCIL that learns a small discriminative graph prompt for each task, essentially resulting in a separate classification model for each task. The prompt learning requires training a single graph neural network (GNN) only once, on the first task, and no data replay is required thereafter, yielding a GCIL model that is both replay-free and forget-free. Extensive experiments on four GCIL benchmarks show that i) our task prototype-based method can achieve 100% task ID prediction accuracy on all four datasets, ii) our GCIL model significantly outperforms state-of-the-art competing methods by at least 18% in average CIL accuracy, and iii) our model is fully free of forgetting on the four datasets.

Paper Structure

This paper contains 25 sections, 4 theorems, 23 equations, 4 figures, 8 tables, 2 algorithms.

Key Result

Theorem 1

If graphs for all tasks are not isolated and the test graph $\mathcal{G}^{\text{test}}$ comes from the task $t$, i.e., $\mathcal{G}^{\text{test}}$ and $\mathcal{G}^{t}$ have the same set of classes, then the distance between $\mathbf{p}^{\text{test}}$ and $\mathbf{p}^t$ approaches zero with a suf
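To make the notion of a task prototype more concrete, the following is a minimal sketch of how a prototype such as $\mathbf{p}^t$ might be computed. It assumes a dense adjacency matrix, the symmetrically normalized adjacency with self-loops as the smoothing operator, and mean pooling over all nodes; these choices, and the function names `laplacian_smooth` and `task_prototype`, are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np

def laplacian_smooth(adj, feats, steps):
    """Apply `steps` rounds of smoothing: X <- D^{-1/2} (A + I) D^{-1/2} X.

    Assumption: self-loops are added and symmetric normalization is used;
    the paper's exact operator may differ.
    """
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    deg = a_hat.sum(axis=1)                     # degrees are >= 1 due to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    smoother = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
    out = feats.astype(float)
    for _ in range(steps):
        out = smoother @ out
    return out

def task_prototype(adj, feats, steps=3):
    """Hypothetical prototype: mean of the smoothed node features."""
    return laplacian_smooth(adj, feats, steps).mean(axis=0)
```

With many smoothing steps on a connected graph, the smoothed features converge, so the prototype stabilizes; this is the intuition behind the "large smoothing step" condition, although the sketch itself proves nothing.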

Figures (4)

  • Figure 1: (a) Classification space of two graph tasks when no task ID is provided. The classification space is split into two separate spaces in Task 1 in (b) and Task 2 in (c) when the task ID can be accurately predicted. This helps alleviate the inter-task class separation issue. To mitigate catastrophic forgetting, we learn a graph prompt for each task that absorbs task-specific discriminative information for better class separation within each task, as shown in (d) and (e) respectively. This essentially results in a separate classification model for each task, achieving fully forget-free GCIL models.
  • Figure 2: Overview of the proposed TPP approach. During training, for each graph task $t$, the task prototype $\mathbf{p}^t$ is generated by applying Laplacian smoothing on the graph $\mathcal{G}^t$ and added to $\mathcal{P}=\{\mathbf{p}^1, \ldots, \mathbf{p}^{t-1}\}$. At the same time, the graph prompt $\Phi^t$ and the classification head $\varphi^t$ for this task are optimized on $\mathcal{G}^t$ through a frozen pre-trained GNN. During inference, the task ID of the test graph is first inferred (i.e., task identification). Then, the graph prompt and the classifier of the predicted task are retrieved to perform the node classification in GCIL. The GNN is trained on $\mathcal{G}^1$ and remains frozen for subsequent tasks.
  • Figure 3: The differences between two graphs in structure and node attributes.
  • Figure 4: (a) The AA results of TPP w.r.t. the size of the graph prompts. (b) Task ID prediction accuracy on all four datasets using Laplacian smoothing (LS) and its variant based on solely node features (NF).
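The inference procedure described in the Figure 2 caption (identify the task, then retrieve that task's prompt and classifier) can be sketched as follows. This is a simplified illustration under stated assumptions: task identification is done by nearest Euclidean distance to the stored prototypes, the prompt is a single vector added to the node features, the head is a linear map, and `encoder` stands in for the frozen GNN. The names `predict_task_id`, `classify`, and `bank` are hypothetical and do not come from the paper.

```python
import numpy as np

def predict_task_id(test_proto, prototypes):
    """Return the index of the stored prototype closest to test_proto."""
    dists = np.linalg.norm(np.stack(prototypes) - test_proto, axis=1)
    return int(np.argmin(dists))

def classify(feats, test_proto, prototypes, bank, encoder):
    """Predict the task, then classify nodes with that task's prompt and head.

    bank[t] = (prompt, head): hypothetical per-task prompt vector and
    linear classification head. `encoder` is a frozen feature extractor.
    """
    t = predict_task_id(test_proto, prototypes)
    prompt, head = bank[t]
    hidden = encoder(feats + prompt)            # prompted input through frozen encoder
    return t, (hidden @ head).argmax(axis=1)    # class index within task t
```

Because each task keeps its own prompt and head while the encoder stays frozen, learning a new task in this sketch never overwrites parameters used by earlier tasks.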

Theorems & Definitions (6)

  • Theorem 1
  • Theorem 2
  • Theorem 2
  • proof
  • Theorem 2
  • proof