On the Limitation and Experience Replay for GNNs in Continual Learning

Junwei Su, Difan Zou, Chuan Wu

TL;DR

This paper presents the first theoretical study of the learnability of GNNs in Node-wise Graph Continual Learning (NGCL), showing that learnability is heavily influenced by structural shifts arising from the interconnected nature of graph data.

Abstract

Continual learning seeks to empower models to progressively acquire information from a sequence of tasks. This approach is crucial for many real-world systems, which are dynamic and evolve over time. Recent research has witnessed a surge in the exploration of Graph Neural Networks (GNNs) in Node-wise Graph Continual Learning (NGCL), a practical yet challenging paradigm involving the continual training of a GNN on node-related tasks. Despite recent advancements in continual learning strategies for GNNs in NGCL, a thorough theoretical understanding, especially regarding their learnability, is lacking. Learnability concerns the existence of a learning algorithm that can produce a good candidate model from the hypothesis/weight space, which is crucial for model selection in NGCL development. This paper introduces the first theoretical exploration of the learnability of GNNs in NGCL, revealing that learnability is heavily influenced by structural shifts due to the interconnected nature of graph data. Specifically, GNNs may not be viable for NGCL under significant structural changes, emphasizing the need to manage structural shifts. To mitigate the impact of structural shifts, we propose a novel experience replay method termed Structure-Evolution-Aware Experience Replay (SEA-ER). SEA-ER features an innovative experience selection strategy that capitalizes on the topological awareness of GNNs, alongside a unique replay strategy that employs structural alignment to effectively counter catastrophic forgetting and diminish the impact of structural shifts on GNNs in NGCL. Our extensive experiments validate our theoretical insights and the effectiveness of SEA-ER.
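
To make the notion of a structural shift concrete, the following is a minimal sketch, not the paper's code. It assumes a toy undirected graph and uses hypothetical helper names (`ego_graph`, `add_edges`). It shows that attaching new-task vertices to an existing vertex changes that vertex's 2-hop ego graph, which is the neighborhood a 2-layer message-passing GNN would aggregate over.

```python
from collections import deque


def ego_graph(adj, root, hops):
    """Return the set of vertices within `hops` edges of `root` (BFS)."""
    seen = {root}
    frontier = deque([(root, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the hop limit
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen


def add_edges(adj, edges):
    """Return a new undirected adjacency dict with `edges` added."""
    new_adj = {v: set(ns) for v, ns in adj.items()}
    for u, v in edges:
        new_adj.setdefault(u, set()).add(v)
        new_adj.setdefault(v, set()).add(u)
    return new_adj


# Toy task-1 graph (illustrative data only): a path 0 - 1 - 2.
graph_task1 = add_edges({}, [(0, 1), (1, 2)])
# Task 2 introduces vertices 3 and 4; vertex 3 attaches to existing vertex 0.
graph_task2 = add_edges(graph_task1, [(3, 0), (3, 4)])

before = ego_graph(graph_task1, 0, hops=2)
after = ego_graph(graph_task2, 0, hops=2)
shifted = after - before  # vertices newly inside vertex 0's 2-hop ego graph
print(sorted(before), sorted(after), sorted(shifted))
```

Vertex 0 belongs to task 1, yet its input neighborhood differs once the task-2 vertices arrive. In this toy setting that is the sense in which the input distribution of old vertices can move even though their own features are unchanged.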

Paper Structure (46 sections, 11 theorems, 25 equations, 9 figures, 3 tables, 3 algorithms)

Key Result

theorem 1

Let $\mathcal{F}$ be a hypothesis space over $\mathbb{G} \times \mathcal{Y}$ captured by GNNs. If there is no control on the structural shift, then, for every $c > 0$, there exists an instance of the NGCL-2 problem satisfying the following:

Figures (9)

  • Figure 1: Illustration of the progression of NGCL. Task $2$ introduces a new batch of vertices, and the model parameters are updated from $W_{1}$ to $W_{2}$ using data from the new task. As the new vertex batch associated with task $\tau_2$ is introduced, the graph structure changes, potentially altering the structural inputs to the GNN (the ego graphs) of the existing vertices. This is referred to as the structural shift.
  • Figure 2: Forgetting Dynamics of the Bare Model on Arxiv under Settings with Constant and Evolving Graphs. (a) shows how catastrophic forgetting of task 1 changes when transitioning to task 2 under the settings with constant and evolving graphs. (b) and (c) show the complete catastrophic forgetting matrices of the Bare model in the inductive and transductive settings; the $x$- and $y$-axes correspond to $i$ and $j$ in $r_{i,j}-r_{i,i}$. The color density indicates the amount of forgetting (a denser color means more forgetting).
  • Figure 3: Experiments with different configurations of cSBM models; the distortion between graph distance and embedding distance; and the effect of replay buffer size (without the structural alignment).
  • Figure 4: Performance Matrix of Bare Model on Transductive and Inductive Setting on CoraFull and Reddit Datasets.
  • Figure 5: Forgetting Performance Matrix of SEA-ER on CoraFull and Reddit Datasets.
  • ...and 4 more figures
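
The quantity $r_{i,j}-r_{i,i}$ in the Figure 2 caption can be illustrated with a short sketch. The indexing convention is an assumption here, since the captions do not define it: `perf[i][j]` is taken to be the performance on task `i` after training through task `j`, in integer percentage points, defined only for `j >= i`. The function name `forgetting_matrix` and the numbers are hypothetical and not taken from the paper.

```python
def forgetting_matrix(perf):
    """Compute f[i][j] = perf[i][j] - perf[i][i] for j >= i, else None.

    Assumed convention: perf[i][j] is the performance on task i after
    training through task j, so negative entries indicate forgetting.
    """
    n = len(perf)
    return [
        [perf[i][j] - perf[i][i] if j >= i else None for j in range(n)]
        for i in range(n)
    ]


# Made-up accuracies (percentage points) for three tasks.
# None marks entries where task i has not been learned yet.
perf = [
    [90, 70, 60],
    [None, 85, 75],
    [None, None, 80],
]

forget = forgetting_matrix(perf)
for row in forget:
    print(row)
```

Under this convention the diagonal is zero by construction, and each row traces how the performance on one task changes as later tasks are learned.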

Theorems & Definitions (23)

  • definition 1: Expressiveness of hypothesis space
  • definition 2
  • theorem 1: necessity of controllable structural shift, informal
  • remark 1
  • remark 2
  • definition 3: distortion rate
  • proposition 1
  • proposition 2
  • definition 4: DA-learnability
  • definition 5: NGCL-2-learnability
  • ...and 13 more