A Topology-aware Graph Coarsening Framework for Continual Graph Learning

Xiaoxue Han, Zhuo Feng, Yue Ning

TL;DR

The paper addresses catastrophic forgetting in continual graph learning by introducing TA$\mathbb{CO}$, a topology-aware framework that compresses past graphs into a reduced representation to preserve topology and exploit inter-task correlations. It combines new and past graphs, then uses RePro, an embedding-proximity-based coarsening method, to produce a compact replay graph, with Node Fidelity Preservation safeguarding minority classes. The approach achieves superior performance and reduced forgetting across three real-world time-stamped datasets (Kindle, DBLP, ACM) using multiple GNN backbones, and demonstrates efficiency gains over existing baselines. By restricting memory growth and preserving structural information, TA$\mathbb{CO}$ offers a scalable solution for continual learning on evolving graphs with practical impact in streaming graph applications.

Abstract

Continual learning on graphs tackles the problem of training a graph neural network (GNN) where graph data arrive in a streaming fashion and the model tends to forget knowledge from previous tasks when updating with new data. Traditional continual learning strategies such as Experience Replay can be adapted to streaming graphs; however, these methods often face challenges such as inefficiency in preserving graph topology and an inability to capture the correlation between old and new tasks. To address these challenges, we propose TA$\mathbb{CO}$, a (t)opology-(a)ware graph (co)arsening and (co)ntinual learning framework that stores information from previous tasks as a reduced graph. At each time period, this reduced graph expands by combining with a new graph and aligning shared nodes, and then it undergoes a "zoom out" process by reduction to maintain a stable size. We design a graph coarsening algorithm based on node representation proximities to efficiently reduce a graph and preserve topological information. We empirically demonstrate that the learning process on the reduced graph can approximate that of the original graph. Our experiments validate the effectiveness of the proposed framework on three real-world datasets using different backbone GNN models.
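
The idea of reducing a graph by merging nodes whose representations are close can be illustrated with a minimal sketch. This is not the authors' coarsening algorithm: it assumes a dense adjacency matrix, cosine similarity as the proximity measure, and a greedy union-find merge over edges, and the function name `coarsen_by_proximity` and all of its parameters are hypothetical.

```python
import numpy as np

def coarsen_by_proximity(adj, emb, feats, target_n):
    """Greedily merge connected nodes whose embeddings are most similar.

    adj:      (n, n) symmetric 0/1 adjacency matrix
    emb:      (n, d) node embeddings (assumed to come from some trained encoder)
    feats:    (n, f) node features
    target_n: desired number of super-nodes
    Returns (coarse_adj, coarse_feats, assign); assign[i] is the super-node of node i.
    """
    n = adj.shape[0]
    # Cosine similarity between the two endpoints of every edge.
    unit = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    rows, cols = np.nonzero(np.triu(adj, k=1))
    sims = np.sum(unit[rows] * unit[cols], axis=1)

    # Union-find over nodes; edges are processed from most to least similar.
    parent = np.arange(n)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    n_clusters = n
    for e in np.argsort(-sims):
        if n_clusters <= target_n:
            break
        a, b = find(rows[e]), find(cols[e])
        if a != b:
            parent[b] = a
            n_clusters -= 1

    # Relabel cluster roots as 0..k-1 and build the partition matrix P.
    roots = np.array([find(i) for i in range(n)])
    _, assign = np.unique(roots, return_inverse=True)
    k = assign.max() + 1
    P = np.zeros((n, k))
    P[np.arange(n), assign] = 1.0

    # Super-node adjacency counts cross-cluster edges; features are averaged.
    coarse_adj = P.T @ adj @ P
    np.fill_diagonal(coarse_adj, 0.0)
    coarse_feats = (P.T @ feats) / P.sum(axis=0)[:, None]
    return coarse_adj, coarse_feats, assign
```

Because the sketch only merges along existing edges, a graph with more connected components than `target_n` cannot be reduced all the way to the target size. Restricting merges to connected pairs is a design choice made here to keep super-nodes topologically coherent.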

Paper Structure

This paper contains 29 sections, 2 theorems, 42 equations, 8 figures, 10 tables, 2 algorithms.

Key Result

Theorem 4.1

Consider $n$ nodes with $c$ classes, such that the class distribution of all nodes is represented by $\mathbf{p} = (p_1, p_2, \ldots, p_c)$, where $\sum_{i=1}^{c} p_i = 1$. If these nodes are randomly partitioned into $n'$ clusters such that $n' = \lfloor \gamma \cdot n \rfloor$, $0<\gamma<1$ and the clas
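
The setup of the theorem (random partitioning of $n$ labeled nodes into $n' = \lfloor \gamma \cdot n \rfloor$ clusters) can be explored numerically. The following is a rough Monte Carlo sketch, not a restatement of the theorem, whose text is cut off in this excerpt. It rests on several assumptions that do not come from the text above: each node is assigned to a cluster independently and uniformly, empty clusters are dropped, and each cluster takes the majority class of its members with random tie-breaking. The function name `simulate_cluster_class_shares` is hypothetical.

```python
import numpy as np

def simulate_cluster_class_shares(p, n, gamma, trials=200, seed=0):
    """Estimate the class distribution over clusters after random partitioning.

    p:     class distribution over nodes (sums to 1)
    n:     number of nodes
    gamma: reduction rate, so n' = floor(gamma * n) clusters
    Returns the mean fraction of clusters labeled with each class.
    ASSUMPTION: cluster label = majority class of its members, ties broken at random.
    """
    rng = np.random.default_rng(seed)
    c = len(p)
    n_clusters = int(np.floor(gamma * n))
    shares = np.zeros(c)
    for _ in range(trials):
        labels = rng.choice(c, size=n, p=p)            # node classes drawn from p
        clusters = rng.integers(0, n_clusters, size=n)  # uniform random cluster ids
        counts = np.zeros((n_clusters, c))
        np.add.at(counts, (clusters, labels), 1)        # per-cluster class counts
        nonempty = counts.sum(axis=1) > 0
        # Noise below 1e-3 only affects exact ties between integer counts.
        noisy = counts[nonempty] + rng.random((nonempty.sum(), c)) * 1e-3
        cluster_labels = noisy.argmax(axis=1)
        shares += np.bincount(cluster_labels, minlength=c) / nonempty.sum()
    return shares / trials
```

Calling `simulate_cluster_class_shares([0.1, 0.9], n=1000, gamma=0.1)` and comparing the first entry of the result with $p_1 = 0.1$ gives an empirical estimate of $\mathbb{E}[p'_1]$ under these assumptions, which is the quantity plotted against $p_1$ in Figure 4.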

Figures (8)

  • Figure 1: A motivating example on the Kindle dataset
  • Figure 2: An overview of TA$\mathbb{CO}$. At the $t$-th time period, the model takes in the coarsened graph $\mathcal{G}^r_{t-1}$ from the last time period and the original graph $\mathcal{G}_{t}$ from the current time period, and combines them into $\mathcal{G}^{c}_{t}$; for the same time period, the selected important node set is updated with the new nodes; the model is then trained on $\mathcal{G}^{c}_{t}$ with both the new nodes and the super-nodes from the past; finally, $\mathcal{G}^{c}_{t}$ is coarsened to $\mathcal{G}^{r}_{t}$ for the next time period.
  • Figure 3: (a) The test macro-F1 scores of the GCN model trained on the coarsened graphs with different reduction rates on three datasets. (b)-(d) t-SNE visualization of node embeddings of the DBLP test graph with reduction rates of 0, 0.5, and 0.9 on the training graph, respectively.
  • Figure 4: $\mathbb{E}[p'_1]$ and $\mathbb{E}[\frac{p'_1}{p_1}]$ against $p_1$ at different reduction rates $\gamma$ for $c=2$ and $c=3$. The dashed lines represent trends with Node Fidelity Preservation (NFP), and the solid lines represent trends without NFP.
  • Figure 5: F1 score on the test set of the first task on Kindle, DBLP, and ACM, after training on more tasks.
  • ...and 3 more figures

Theorems & Definitions (6)

  • Theorem 4.1
  • proof
  • proof
  • Theorem B.1
  • proof
  • proof