Graph Continual Learning with Debiased Lossless Memory Replay

Chaoxi Niu, Guansong Pang, Ling Chen

TL;DR

This paper introduces Debiased Lossless Memory replay (DeLoMe), a framework that learns small, lossless synthetic node representations as the memory. The learned memory preserves graph data privacy and captures holistic graph information, neither of which sampling-based methods can achieve.

Abstract

Real-life graph data often expands continually, rendering the learning of graph neural networks (GNNs) on static graph data impractical. Graph continual learning (GCL) tackles this problem by continually adapting GNNs to the expanded graph of the current task while maintaining performance on the graphs of previous tasks. Memory replay-based methods, which replay data of previous tasks when learning new tasks, have been explored as one principled approach to mitigating the forgetting of knowledge learned from previous tasks. In this paper, we extend this methodology with a novel framework called Debiased Lossless Memory replay (DeLoMe). Unlike existing methods that sample nodes/edges of previous graphs to construct the memory, DeLoMe learns small lossless synthetic node representations as the memory. The learned memory not only preserves graph data privacy but also captures holistic graph information, neither of which sampling-based methods can achieve. Further, prior methods suffer from bias toward the current task due to the data imbalance between the classes in the memory data and the current data. A debiased GCL loss function is devised in DeLoMe to effectively alleviate this bias. Extensive experiments on four graph datasets show the effectiveness of DeLoMe under both class- and task-incremental learning settings.
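
The abstract does not spell out the form of the debiased loss. As a rough illustration of how a loss can counter the imbalance between memory classes and current-task classes, the sketch below uses logit adjustment with log class priors, a generic technique chosen here as an assumption rather than taken from this text. The names `class_priors`, `logit_adjusted_ce`, and the temperature `tau` are hypothetical.

```python
import math
from collections import Counter


def class_priors(labels, num_classes):
    """Empirical class frequencies over the replayed memory plus current labels."""
    counts = Counter(labels)
    total = len(labels)
    # A small floor avoids log(0) for classes that are absent from the labels.
    return [max(counts.get(c, 0) / total, 1e-12) for c in range(num_classes)]


def logit_adjusted_ce(logits, label, priors, tau=1.0):
    """Cross-entropy on logits shifted by tau * log(prior) (assumed debiasing form).

    Rare classes get a large negative shift, so the model must produce a larger
    raw logit for them to reach the same loss, which counteracts the pull
    toward the majority (current-task) classes.
    """
    adjusted = [z + tau * math.log(p) for z, p in zip(logits, priors)]
    m = max(adjusted)  # subtract the max for a numerically stable log-sum-exp
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[label]


# One memory node of class 0 versus nine current-task nodes of class 1.
priors = class_priors([0] * 1 + [1] * 9, num_classes=2)
rare_loss = logit_adjusted_ce([0.0, 0.0], 0, priors)
common_loss = logit_adjusted_ce([0.0, 0.0], 1, priors)
```

With identical raw logits, the loss on the under-represented class is larger than on the over-represented one, so gradient updates push harder on the memory classes. With uniform priors the function reduces to ordinary cross-entropy.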

Paper Structure (35 sections, 9 equations, 6 figures, 7 tables, 2 algorithms)

Figures (6)

  • Figure 1: Left: (a) Current replay-based methods use a sampling-based memory consisting of partially sampled graph data, (b) whereas our approach learns to generate the memory using a lossless small graph with synthetic node representations. Right: Average accuracy (AA) of three replay methods -- SSM, CaT and our DeLoMe -- with increasing imbalance rates on the real-world ArXiv dataset.
  • Figure 2: Overview of DeLoMe. We take two consecutive tasks ($\mathcal{G}_{t-1}$ and $\mathcal{G}_{t}$) as an example, where $\text{GNN}_{t-1}$ and $\text{GNN}_{t}$ represent the GNN models trained after tasks $t-1$ and $t$, respectively. At task $t-1$, we learn a synthetic node representation-based memory $\hat{\mathcal{G}}_{t-1}$ for $\mathcal{G}_{t-1}$ and add it to the memory buffer via $\mathcal{B}_t=\mathcal{B}_{t-1}\cup \hat{\mathcal{G}}_{t-1}$. At task $t$, the memory buffer $\mathcal{B}_t$ is replayed with the current graph data $\mathcal{G}_t$ to train the model $\text{GNN}_{t}$ using our debiased GCL objective. The process is repeated until all the tasks are learned.
  • Figure 3: Visualization of node embeddings of the original graph and the memories obtained by different methods on this graph.
  • Figure 4: Average accuracy (AA) of DeLoMe against SOTA sampling-based memory construction methods on all tasks.
  • Figure 5: AA results with different memory budgets on CoraFull and ArXiv under class-incremental learning.
  • ...and 1 more figures
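
The replay procedure described in the Figure 2 caption can be sketched as a simple loop over tasks. This is a minimal outline under stated assumptions: `train_step` and `learn_memory` are hypothetical callables standing in for the debiased GCL training and the synthetic-memory learning, neither of which is specified in this text, and the choice to learn the memory right after training on each task is also an assumption.

```python
from typing import Any, Callable, List, Tuple


def continual_replay(
    tasks: List[Any],
    train_step: Callable[[Any, Any, List[Any]], Any],
    learn_memory: Callable[[Any], Any],
) -> Tuple[Any, List[Any], List[Tuple[int, int]]]:
    """Train on a sequence of task graphs while replaying a growing memory buffer."""
    buffer: List[Any] = []  # the buffer is empty before the first task
    model: Any = None
    history: List[Tuple[int, int]] = []  # (task index, buffer size seen at training)

    for t, graph in enumerate(tasks, start=1):
        # Task t: train GNN_t on the current graph G_t together with buffer B_t.
        model = train_step(model, graph, list(buffer))
        history.append((t, len(buffer)))
        # Learn a small synthetic memory for G_t and add it to the buffer,
        # so that B_{t+1} = B_t U memory(G_t).
        buffer.append(learn_memory(graph))

    return model, buffer, history


# Toy stand-ins so the loop can be exercised without a GNN library.
def toy_train(model, graph, replayed):
    return {"trained_on": graph, "replayed": replayed}


def toy_memory(graph):
    return f"mem({graph})"


final_model, final_buffer, seen = continual_replay(["G1", "G2", "G3"], toy_train, toy_memory)
```

The toy run shows the property the caption describes: each task is trained with the memories of all earlier tasks, and the raw graphs of earlier tasks are never passed to later training steps.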