Towards Robust Graph Incremental Learning on Evolving Graphs

Junwei Su; Difan Zou; Zijun Zhang; Chuan Wu

TL;DR

This paper formalizes and analyzes the inductive Node-wise Graph Incremental Learning (NGIL) problem and proposes Structural-Shift-Risk-Mitigation (SSRM), a regularization-based technique that mitigates the effect of structural shift on catastrophic forgetting in this setting.

Abstract

Incremental learning is a machine learning approach in which a model is trained on a sequence of tasks rather than on all tasks at once. The ability to learn incrementally from a stream of tasks is crucial for many real-world applications. However, incremental learning is challenging on graph-structured data, because many graph-related problems involve prediction tasks for individual nodes, a setting known as Node-wise Graph Incremental Learning (NGIL). This makes the sample data generation process non-independent and non-identically distributed, so it is difficult to maintain model performance as new tasks are added. In this paper, we focus on the inductive NGIL problem, which accounts for the evolution of graph structure (structural shift) induced by emerging tasks. We provide a formal formulation and analysis of the problem, and propose a novel regularization-based technique called Structural-Shift-Risk-Mitigation (SSRM) to mitigate the impact of structural shift on catastrophic forgetting in inductive NGIL. We show that structural shift can shift the input distribution of existing tasks, which in turn increases the risk of catastrophic forgetting. Through comprehensive empirical studies on several benchmark datasets, we demonstrate that SSRM is flexible, easy to adopt, and improves the performance of state-of-the-art GNN incremental learning frameworks in the inductive setting.

Paper Structure

This paper contains 37 sections, 4 theorems, 25 equations, 11 figures, 4 tables, 1 algorithm.

Introduction
Related Work
Incremental Learning
Graph Incremental Learning
Preliminary and Problem Formulation
Structural Shift and Catastrophic Forgetting
Structural Shift
Structural Shift on Catastrophic Forgetting Risk
Structural Shift Risk Mitigation
Experiment
Datasets and Experimental Set-up.
Incremental Learning Frameworks.
Evaluation Metric.
Results
Difference in Transductive and Inductive.
...and 22 more sections

Key Result

Proposition 4.1

If $\frac{C_1(V_1)}{C_2(V_1)} \neq \frac{C_1(V_2)}{C_2(V_2)}$, then $\mathbb{E}[\text{mean-agg}(v) \mid \mathcal{G}_{\mathcal{T}_1}] \neq \mathbb{E}[\text{mean-agg}(v) \mid \mathcal{G}_{\mathcal{T}_2}]$ for all $v \in V_1$.
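
The intuition behind this proposition can be illustrated with a small numerical sketch. It is not the paper's proof or setup: it assumes that $C_k(V)$ counts the class-$k$ vertices in $V$, that each class has a fixed mean feature vector, and that a vertex's neighbors are drawn uniformly from the current graph. Under those assumptions the expected mean-aggregated feature is the class-proportion-weighted average of the class means. The helper `expected_mean_agg` and all numbers are hypothetical.

```python
import numpy as np

def expected_mean_agg(class_counts, class_means):
    """Expected mean-aggregated neighbor feature, assuming neighbors are
    sampled uniformly from a graph with the given per-class vertex counts."""
    counts = np.asarray(class_counts, dtype=float)
    means = np.asarray(class_means, dtype=float)
    proportions = counts / counts.sum()   # class mix of the current graph
    return proportions @ means            # weighted average of class means

# Hypothetical mean feature vector of each class (assumption, not from the paper).
class_means = np.array([[1.0, 0.0],
                        [0.0, 1.0]])

counts_v1 = np.array([30, 10])            # assumed (C_1(V_1), C_2(V_1)), ratio 3:1
counts_v2_shifted = np.array([10, 30])    # V_2 with a different class ratio, 1:3
counts_v2_matched = np.array([60, 20])    # V_2 with the same class ratio, 3:1

# Task 1 graph contains only V_1; the task 2 graph contains V_1 and V_2.
agg_t1 = expected_mean_agg(counts_v1, class_means)
agg_t2_shifted = expected_mean_agg(counts_v1 + counts_v2_shifted, class_means)
agg_t2_matched = expected_mean_agg(counts_v1 + counts_v2_matched, class_means)

print("task 1 graph:           ", agg_t1)
print("task 2, ratio differs:  ", agg_t2_shifted)
print("task 2, ratio unchanged:", agg_t2_matched)
```

In this toy setting the expected aggregate for a vertex in $V_1$ moves only when the new batch changes the class ratio, which mirrors the condition in the proposition.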

Figures (11)

Figure 1: An illustration of the difference between transductive and inductive NGIL. $\mathcal{T}_1$ and $\mathcal{T}_2$ are two consecutive tasks. In the transductive setting, the 1-hop ego graph of the vertex with a red circle would remain the same, while in the inductive setting, the graph may change as new tasks are introduced and the overall graph structure evolves.
Figure 2: Illustration of the Progression in Inductive NGIL. Each task $i$ results in an update to the model's parameters from $\theta_{i-1}$ to $\theta_{i}$ using data from the new task. As new vertex batches associated with each task are introduced, the graph structure changes, potentially altering the input distribution of existing vertices through changes in their ego graphs.
Figure 3: Learning Dynamics of the Bare Model on Arxiv in Transductive and Inductive Settings. (a) shows the change in task 1 performance when transitioning to task 2 in the inductive and transductive settings. (b) and (c) are the complete performance matrices (the $x$- and $y$-axes correspond to $i$ and $j$ in $r_{i,j}$, respectively) of the Bare model in the inductive and transductive settings.
Figure 4: Learning Dynamics of ER-GNN on Arxiv with and without SSRM. (a) and (b) are the complete performance matrices (the $x$- and $y$-axes correspond to $i$ and $j$ in $r_{i,j}$, respectively) of ER-GNN on Arxiv with and without SSRM. (c) shows the learning curves of the two settings, illustrating that SSRM leads to a higher APS for each task.
Figure 5: Parameter sensitivity of SSRM. The x-axis shows the datasets and the y-axis shows FAP. Results are averaged over five trials. (a) shows model performance (FAP) when varying $\alpha$ with $\beta = 0$. (b) shows model performance (FAP) when varying $\beta$ with $\alpha = 0$. $\alpha$ and $\beta$ are the two hyperparameters in the paper's learning objective that control the regularization effect.
...and 6 more figures

Theorems & Definitions (8)

Proposition 4.1: Imbalanced Observation
Definition 4.2: Maximum Mean Discrepancy
Theorem 4.3: CFR Bound
Theorem 5.1: Induced CFR Bound
proof
Lemma 2.1
proof
proof
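
Definition 4.2 in the list above names the Maximum Mean Discrepancy (MMD), a standard measure of the distance between two distributions. The paper's exact definition is not reproduced on this page, so the following is only a minimal sketch of the textbook biased empirical estimator of squared MMD with an RBF kernel. The function names, the kernel choice, and the `gamma` bandwidth are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """Pairwise RBF kernel exp(-gamma * ||a_i - b_j||^2) between rows of a and b."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def mmd_squared(x, y, gamma=1.0):
    """Biased empirical estimate of squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy

# Hypothetical vertex representations before and after a structural shift.
rng = np.random.default_rng(0)
before = rng.normal(size=(50, 2))
after_shifted = before + 3.0              # a large synthetic shift

print("no shift:  ", mmd_squared(before, before))
print("with shift:", mmd_squared(before, after_shifted))
```

In this sketch, identical samples give an estimate of zero, while the shifted sample gives a clearly positive value. A discrepancy of this kind is one way to quantify how far the input distribution of existing vertices has moved between tasks.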
