Towards Continuous Reuse of Graph Models via Holistic Memory Diversification

Ziyue Qiao; Junren Xiao; Qingqiang Sun; Meng Xiao; Xiao Luo; Hui Xiong

TL;DR

This work tackles continual learning on growing graphs, where new tasks with unseen classes arrive over time and memory is constrained. It introduces Diversified Memory Selection and Generation (DMSG), which jointly optimizes intra- and inter-class memory diversity via a greedy buffer-selection strategy and expands replay diversity through a variational embedding generator with adversarial regularization and a reconstruction-based decoder. The approach yields strong empirical gains on four benchmark graphs, closely approaching joint-training performance while reducing forgetting, and demonstrates the value of memory diversification for scalable graph continual learning. In practice, DMSG enables continuous reuse of graph models in dynamic domains (e.g., social networks, knowledge graphs) with limited memory by preserving broad, high-quality past knowledge during successive task learning.
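The greedy buffer-selection idea can be illustrated with a minimal Python sketch. The scoring rule is an assumption made for illustration, not the paper's objective: each candidate is scored by its distance to the nearest already-selected node of its class (intra-class spread) plus a weighted distance to the nearest node of any other class (inter-class separation). The names `greedy_diverse_buffer` and `inter_weight`, and the Euclidean metric, are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_diverse_buffer(embeddings, labels, target_class, budget, inter_weight=1.0):
    """Greedily pick up to `budget` node indices of `target_class`.

    Hypothetical score (an assumption, not the paper's objective):
      intra = distance to the nearest already-selected node of the same class
      inter = distance to the nearest node of any other class
      gain  = intra + inter_weight * inter
    """
    candidates = [i for i, y in enumerate(labels) if y == target_class]
    others = [i for i, y in enumerate(labels) if y != target_class]
    selected = []

    def gain(i):
        intra = min(
            (euclidean(embeddings[i], embeddings[j]) for j in selected),
            default=0.0,  # nothing selected yet: no intra-class term
        )
        inter = min(
            (euclidean(embeddings[i], embeddings[j]) for j in others),
            default=0.0,  # no other classes present
        )
        return intra + inter_weight * inter

    while candidates and len(selected) < budget:
        best = max(candidates, key=gain)  # one greedy step
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: three nodes of class 0 and one node of class 1.
toy_embeddings = [(0.0, 0.0), (0.1, 0.0), (4.0, 0.0), (0.0, 10.0)]
toy_labels = [0, 0, 0, 1]
toy_buffer = greedy_diverse_buffer(toy_embeddings, toy_labels, target_class=0, budget=2)
```

On the toy data the two near-duplicate nodes at (0.0, 0.0) and (0.1, 0.0) are not both chosen, which is the behavior a diversity-aware selector is meant to produce.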

Abstract

This paper addresses the challenge of incremental learning in growing graphs with increasingly complex tasks. The goal is to continuously train a graph model to handle new tasks while retaining proficiency in previous tasks via memory replay. Existing methods usually overlook the importance of memory diversity, which limits their ability to select high-quality memory from previous tasks and to retain broad previous knowledge within the scarce memory available on graphs. To address this, we introduce a novel holistic Diversified Memory Selection and Generation (DMSG) framework for incremental learning in graphs. It first introduces a buffer selection strategy that considers both intra-class and inter-class diversities, employing an efficient greedy algorithm to sample representative training nodes from graphs into memory buffers after learning each new task. Then, to adequately rememorize the knowledge preserved in the memory buffer when learning new tasks, a diversified memory generation replay method is introduced. This method utilizes a variational layer to generate the distribution of buffer node embeddings and samples synthesized embeddings for replay. Furthermore, an adversarial variational embedding learning method and a reconstruction-based decoder are proposed to maintain the integrity and consolidate the generalization of the synthesized node embeddings, respectively. Extensive experimental results on publicly accessible datasets demonstrate the superiority of DMSG over state-of-the-art methods.
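The generative replay step, which models a distribution over buffer node embeddings and samples synthesized ones, can be sketched as follows. This is a simplification under stated assumptions: the paper's variational layer is learned, whereas this sketch substitutes a closed-form diagonal Gaussian fitted to the buffer embeddings, and it omits the adversarial regularization and the decoder. The function names `fit_diagonal_gaussian` and `sample_synthetic` are hypothetical.

```python
import math
import random


def fit_diagonal_gaussian(embeddings):
    """Per-dimension mean and log-variance of the buffer embeddings.

    Assumption: a moment estimate stands in for a learned variational layer.
    """
    n = len(embeddings)
    dim = len(embeddings[0])
    mu = [sum(e[k] for e in embeddings) / n for k in range(dim)]
    var = [sum((e[k] - mu[k]) ** 2 for e in embeddings) / n for k in range(dim)]
    log_var = [math.log(v + 1e-8) for v in var]  # small constant avoids log(0)
    return mu, log_var


def sample_synthetic(mu, log_var, num_samples, rng):
    """Reparameterized sampling: z = mu + sigma * eps with eps ~ N(0, I)."""
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    return [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        for _ in range(num_samples)
    ]


# Toy buffer of two 2-dimensional node embeddings.
buffer_embeddings = [[0.0, 0.0], [2.0, 4.0]]
mu, log_var = fit_diagonal_gaussian(buffer_embeddings)
synthetic = sample_synthetic(mu, log_var, num_samples=5, rng=random.Random(0))
```

The synthesized embeddings would then be replayed alongside the stored buffer nodes when training on a new task; how they enter the loss is not specified here.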

Paper Structure (27 sections, 7 theorems, 33 equations, 8 figures, 4 tables, 1 algorithm)

Introduction
Problem Formulation
Methodology
Heuristic Diversified Memory Selection
Diversified Memory Generative Replay
Overall Optimization
Experiments
Related Works
Conclusion
Acknowledgment
Appendix
Extended Analysis of Theorem 1
Theoretical Analysis of Greedy Algorithm 1
Derivation of Loss $\mathcal{L}_{CGSE}$
Synchronized Min-Max Optimization of Overall Loss $\mathcal{L}_{DMSG}$
...and 12 more sections

Key Result

Theorem 1

Let the loss function $\mathcal{L}(\theta, x)$ be $\beta$-Lipschitz continuous with respect to the input $x$. Under this condition, the discrepancy between the expected loss under the true data distribution $p(G_{<t})$ and that under the replay buffer distribution $q(\mathcal{B}_{<t})$ is bounded as follows: $\left| \mathbb{E}_{x \sim p(G_{<t})}[\mathcal{L}(\theta, x)] - \mathbb{E}_{x \sim q(\mathcal{B}_{<t})}[\mathcal{L}(\theta, x)] \right| \leq \beta \cdot W(p, q)$, where $W(p, q)$ denotes the Wasserstein distance between distributions $p$ and $q$, defined by: $W(
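A bound of this kind can be sanity-checked numerically in one dimension. The sketch below assumes the standard form in which the gap between the two expected losses is at most $\beta$ times the 1-Wasserstein distance. The toy loss, the two Gaussian sample sets, and the names `wasserstein_1d`, `toy_loss`, and `check_bound` are all hypothetical. It relies on the fact that, for two equal-size empirical distributions on the real line, the 1-Wasserstein distance equals the mean absolute gap between sorted samples.

```python
import random


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical distributions."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def toy_loss(x, beta):
    """A beta-Lipschitz function of x (its slope is +beta or -beta)."""
    return beta * abs(x - 0.5)


def check_bound(seed=0, n=1000, beta=2.0):
    """Return (expected-loss gap, beta * W1) for two hypothetical sample sets."""
    rng = random.Random(seed)
    full_data = [rng.gauss(0.0, 1.0) for _ in range(n)]    # stands in for p(G_<t)
    buffer_data = [rng.gauss(0.3, 0.8) for _ in range(n)]  # stands in for q(B_<t)
    mean_full = sum(toy_loss(x, beta) for x in full_data) / n
    mean_buffer = sum(toy_loss(x, beta) for x in buffer_data) / n
    gap = abs(mean_full - mean_buffer)
    bound = beta * wasserstein_1d(full_data, buffer_data)
    return gap, bound


gap, bound = check_bound()
print(gap <= bound)
```

Read this way, a buffer whose distribution stays close to the full data in Wasserstein distance keeps the replay loss close to the loss on all previous data, which is consistent with the motivation for selecting a diverse, representative buffer.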

Figures (8)

Figure 1: An example of incremental learning in growing graphs, where nodes with distinct labels are shaded in various colors. The number of classes expands as the graph grows, causing increasingly complex classification tasks.
Figure 2: The framework of DMSG. In this instance, the graph model underwent training on a 2-class node classification task on $G^1$. Two new classes of nodes are added to $G^1$ to form $G^2$. Certain nodes of the previous two classes are first selected into buffers. Then, the model is further trained on the two new classes of nodes and buffers to perform incremental learning.
Figure 3: Dynamics of the average accuracy during incremental learning on different growing graphs.
Figure 4: Accuracy matrices of DMSG and SEM in different datasets.
Figure 5: Cumulative number of nodes within different growing graphs.
...and 3 more figures

Theorems & Definitions (12)

Theorem 1
Proposition 1
proof
Proposition 2
Theorem 2
proof
Lemma 1
proof
Lemma 2
proof
...and 2 more
