An Efficient Memory Module for Graph Few-Shot Class-Incremental Learning

Dong Li, Aijia Zhang, Junqi Gao, Biqing Qi

TL;DR

This work introduces Mecoin, an efficient method for building and maintaining memory for graph representation learning. It analyzes Mecoin's effectiveness in terms of generalization error and examines, through experiments and VC-dimension analysis, how different distillation strategies affect model performance.

Abstract

Incremental graph learning has gained significant attention for its ability to address the catastrophic forgetting problem in graph representation learning. However, traditional methods often rely on a large number of labels for node classification, which is impractical in real-world applications. This makes few-shot incremental learning on graphs a pressing need. Current methods typically require extensive training samples from meta-learning to build memory and perform intensive fine-tuning of GNN parameters, leading to high memory consumption and potential loss of previously learned knowledge. To tackle these challenges, we introduce Mecoin, an efficient method for building and maintaining memory. Mecoin employs Structured Memory Units (SMU) to cache prototypes of learned categories, as well as Memory Construction Modules (MeCs) to update these prototypes for new categories through interactions between the nodes and the cached prototypes. Additionally, we have designed a Memory Representation Adaptation Module (MRaM) to store probabilities associated with each class prototype, reducing the need for parameter fine-tuning and lowering the forgetting rate. When a sample matches its corresponding class prototype, the relevant probabilities are retrieved from the MRaM. Knowledge is then distilled back into the GNN through a Graph Knowledge Distillation Module, preserving the model's memory. We analyze the effectiveness of Mecoin in terms of generalization error and explore the impact of different distillation strategies on model performance through experiments and VC-dimension analysis. Compared to other related works, Mecoin shows superior performance in accuracy and forgetting rate. Our code is publicly available at https://github.com/Arvin0313/Mecoin-GFSCIL.git.
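
The matching-and-retrieval idea in the abstract can be illustrated with a small sketch. This is not Mecoin's implementation. It assumes each class prototype is the mean of a few labelled embeddings, that a sample is matched to its nearest prototype by Euclidean distance, and that distillation uses a plain KL-divergence term. The names `PrototypeMemory` and `kl_distillation` are hypothetical and do not come from the paper or its repository.

```python
import numpy as np


class PrototypeMemory:
    """Toy prototype cache: one prototype and one probability vector per class.

    Hypothetical illustration only; not the SMU/MRaM design from the paper.
    """

    def __init__(self):
        self.prototypes = {}  # class id -> prototype embedding
        self.probs = {}       # class id -> cached class-probability vector

    def add_class(self, class_id, embeddings, probs):
        # Assumption: the prototype is the mean of the few labelled embeddings.
        self.prototypes[class_id] = np.asarray(embeddings, dtype=float).mean(axis=0)
        self.probs[class_id] = np.asarray(probs, dtype=float)

    def match(self, embedding):
        # Assumption: match a sample to its nearest prototype (Euclidean distance).
        x = np.asarray(embedding, dtype=float)
        ids = list(self.prototypes)
        dists = [np.linalg.norm(x - self.prototypes[c]) for c in ids]
        return ids[int(np.argmin(dists))]

    def retrieve(self, embedding):
        # Return the probabilities cached for the matched class.
        return self.probs[self.match(embedding)]


def kl_distillation(teacher_probs, student_probs, eps=1e-12):
    """KL(teacher || student), used here as a generic distillation term."""
    t = np.clip(np.asarray(teacher_probs, dtype=float), eps, 1.0)
    s = np.clip(np.asarray(student_probs, dtype=float), eps, 1.0)
    return float(np.sum(t * (np.log(t) - np.log(s))))


memory = PrototypeMemory()
memory.add_class(0, [[0.0, 0.0], [0.2, 0.0]], probs=[0.9, 0.1])
memory.add_class(1, [[1.0, 1.0], [1.0, 0.8]], probs=[0.2, 0.8])

query = [0.9, 1.0]                   # embedding of a new sample
matched = memory.match(query)        # nearest cached class
cached = memory.retrieve(query)      # probabilities stored for that class
loss = kl_distillation(cached, [0.5, 0.5])  # penalty for a student that disagrees
```

In this sketch the cached probabilities act as a fixed teacher signal, so a model being updated on new classes is penalized when its predictions drift from what was stored for old ones.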

Paper Structure

This paper contains 18 sections, 18 equations, 6 figures, 7 tables.

Figures (6)

  • Figure 1: Overview of the Mecoin framework for GFSCIL. (a) Graph neural network: Consists of a GNN encoder and a classifier (MLP) pre-trained by GNN. In GFSCIL tasks, the encoder parameters are frozen. (b) Structured Memory Unit: Constructs class prototypes through MeCs and stores them in SMU. (c) Memory Representation Adaptive Module: Facilitates adaptive knowledge interaction with the GNN model.
  • Figure 2: The comparative analysis of the mean performance, accuracy curves and memory utilization of HAG-Meta, Geometer and Mecoin across 10 sessions on CoraFull, conducted under the experimental conditions delineated in their respective publications.
  • Figure 3: The outcomes of GKIM when conducting the few-shot continuous learning task on the CoraFull, Computers and CS datasets. The results are presented sequentially from left to right: GKIM with full capabilities, GKIM where node features do not interact with class prototypes in the SMU, GKIM without GraphInfo and GKIM without MeCs. The results for CoraFull are shown in the top row, those for Computers in the middle row and those for CS in the bottom row.
  • Figure 4: Left 2 columns: Line charts depict the performance of models across various sessions on the CoraFull and CS datasets when using different distillation methods; Right 2 columns: Histograms illustrate the forgetting rates of different distillation methods on these two datasets.
  • Figure 5: Clustering results for the graph few-shot continuous learning task on the Computers dataset. From left to right: GKIM with full capabilities, GKIM without GraphInfo, GKIM where node features do not interact with class prototypes in the SMU, and GKIM without MeCs. Four categories are randomly selected from session 1, and 400 samples randomly selected from these four categories are clustered around the class centers given by the class prototypes learned during training.
  • ...and 1 more figure