Online Continual Graph Learning

Giovanni Donghi, Luca Pasa, Daniele Zambon, Cesare Alippi, Nicolò Navarin

TL;DR

This work defines Online Continual Graph Learning (OCGL), a framework for node-level learning on evolving graphs under strict memory and computation budgets with anytime-inference needs. It formalizes the problem, analyzes neighborhood expansion, and proposes simple neighborhood-sampling strategies alongside a minimal but effective baseline, LINEAR. A comprehensive benchmark across seven datasets and nine adapted CL strategies reveals replay-based methods generally excel, with LINEAR offering a fast, robust alternative. The study lays a foundation for systematic progress in OCGL and highlights directions such as addressing neighborhood drift, richer streams, and broader tasks like link prediction.

Abstract

Continual Learning (CL) aims to incrementally acquire new knowledge while mitigating catastrophic forgetting. Within this setting, Online Continual Learning (OCL) focuses on updating models promptly and incrementally from single or small batches of observations from a data stream. Extending OCL to graph-structured data is crucial, as many real-world networks evolve over time and require timely, online predictions. However, existing continual or streaming graph learning methods typically assume access to entire graph snapshots or multiple passes over tasks, violating the efficiency constraints of the online setting. To address this gap, we introduce the Online Continual Graph Learning (OCGL) setting, which formalizes node-level continual learning on evolving graphs under strict memory and computational budgets. OCGL defines how a model incrementally processes a stream of node-level information while maintaining anytime inference and respecting resource constraints. We further establish a comprehensive benchmark comprising seven datasets and nine CL strategies, suitably adapted to the OCGL setting, enabling a standardized evaluation setup. Finally, we present a minimalistic yet competitive baseline for OCGL, inspired by our benchmarking results, that achieves strong empirical performance with high efficiency.

Paper Structure

This paper contains 39 sections, 22 figures, 18 tables.

Figures (22)

  • Figure 1: Illustration of how the graph evolves under CGL and OCGL. Top left: in CGL, task subgraphs are incrementally attached to the existing graph (and training is performed offline until convergence on the subgraphs). Bottom left: in OCGL, individual nodes are attached to the graph in order of their arrival (and training is performed online, in one pass, on individual nodes or mini-batches of nodes). Right: the size of the observed 2-hop neighborhood of a node is kept bounded by using neighborhood sampling. In the example, 2 neighbors of $v_t$ are sampled, and then recursively 2 neighbors for each of them.
  • Figure 2: Number of nodes in the union of $l$-hop neighborhoods of each training batch. Curves are smoothed with a rolling average over windows of 10 batches for readability; the maximum is reported in the legend.
  • Figure 3: Anytime evaluation on the CoraFull and Reddit datasets: for each method, the line shows the average accuracy measured after each training batch. Vertical dotted lines mark the task boundaries, with the validation boundary highlighted, and the upper bound obtained by jointly training up to the current task is also reported. Accuracy is expected to decrease with the batch index, as new classes are introduced and the classification task becomes increasingly complex. Similar plots for all datasets are provided in Appendix \ref{app:plots}.
  • Figure 4: Anytime evaluation by task: a breakdown of model performance at the end of each training batch for three selected methods on the CoraFull dataset. Similar plots for all datasets and CL methods are provided in Appendix \ref{app:plots}.
  • Figure 5: Results (Average Anytime Performance) with different buffer sizes for replay methods across the datasets.
  • ...and 17 more figures
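
The neighborhood sampling described in the Figure 1 caption (2 neighbors of $v_t$, then recursively 2 neighbors for each of them) can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `sample_neighborhood`, the adjacency-list dictionary `adj`, and the choice of uniform sampling without replacement are assumptions made for illustration only. With fanout $k$ and $l$ hops, the sampled neighborhood contains at most $\sum_{i=0}^{l} k^i$ nodes, i.e. at most 7 for $k = 2$, $l = 2$, regardless of how large the true neighborhood is.

```python
import random


def sample_neighborhood(adj, root, fanout=2, hops=2, rng=None):
    """Return the node set of a sampled `hops`-hop neighborhood of `root`.

    Hypothetical helper (not from the paper). `adj` maps each node to a
    list of its neighbors. At every hop, at most `fanout` neighbors are
    drawn uniformly without replacement for each frontier node (an
    assumed sampling rule), so the result has at most
    1 + fanout + ... + fanout**hops nodes.
    """
    rng = rng or random.Random()
    visited = {root}
    frontier = [root]
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            neighbors = adj.get(node, [])
            k = min(fanout, len(neighbors))
            for nb in rng.sample(neighbors, k):
                # Expand each node only once, even if sampled repeatedly.
                if nb not in visited:
                    visited.add(nb)
                    next_frontier.append(nb)
        frontier = next_frontier
    return visited


# Usage: a dense toy graph where every node has 49 neighbors.
toy_adj = {i: [j for j in range(50) if j != i] for i in range(50)}
sampled = sample_neighborhood(toy_adj, root=0, fanout=2, hops=2,
                              rng=random.Random(0))
print(len(sampled), "nodes observed instead of", len(toy_adj))
```

The bound depends only on the fanout and the number of hops, not on node degrees, which is what keeps the per-node memory and computation cost fixed as the graph grows.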