
Summarizing Stream Data for Memory-Constrained Online Continual Learning

Jianyang Gu, Kai Wang, Wei Jiang, Yang You

TL;DR

The paper addresses memory-constrained online continual learning by introducing Summarizing Stream Data (SSD), a method that distills stream information into a small set of informative memory samples. SSD aligns training gradients and past-task relationships so that the summarized samples mimic the training effect of real data, and it uses past summarized samples to help preserve distributional knowledge. The approach combines a dynamic memory, gradient matching, and relationship matching. It reports gains across multiple benchmarks, most visibly at small memory budgets (more than 3% accuracy on sequential CIFAR-100 under an extremely restricted buffer), with limited computational overhead. These findings position SSD as a practical enhancement to replay-based online CL that is compatible with existing baselines and offers a path toward more memory-efficient continual learning.

Abstract

Replay-based methods have proved effective for online continual learning by rehearsing past samples from an auxiliary memory. While much effort has gone into improving memory-based training schemes, the information carried by each sample in the memory remains under-investigated. When storage space is restricted, the informativeness of the memory becomes critical for effective replay. Although some works design specific strategies to select representative samples, they keep only a small number of original images, so the storage space is still not well utilized. To this end, we propose to Summarize the knowledge from the Stream Data (SSD) into more informative samples by distilling the training characteristics of real images. By maintaining the consistency of training gradients and of the relationship to past tasks, the summarized samples are more representative of the stream data than the original images. Extensive experiments on multiple online continual learning benchmarks show that the proposed SSD method significantly enhances the replay effects. We demonstrate that, with limited extra computational overhead, SSD provides an accuracy boost of more than 3% on sequential CIFAR-100 under an extremely restricted memory buffer. Code is available at https://github.com/vimar-gu/SSD.

Paper Structure (39 sections, 8 equations, 8 figures, 7 tables, 1 algorithm)

This paper contains 39 sections, 8 equations, 8 figures, 7 tables, 1 algorithm.

Figures (8)

  • Figure 1: With a restricted memory size of 100, the information contained in the memory is rather limited. By integrating the information from stream data into summarized samples, our proposed SSD largely enhances the replay effects. Experiments are conducted on the sequential CIFAR-100 benchmark (10 tasks). The size of each scatter point represents GPU memory consumption during training.
  • Figure 2: Concept Comparison: (a) Previous CL methods select original images from the stream data to form the auxiliary memory. (b) We propose to summarize stream data into informative samples to enhance the replay effects.
  • Figure 3: The pipeline of our proposed Summarizing Stream Data (SSD) method. The memory is composed of the summarized samples of previous tasks (Past Summarized Samples), the samples of the current task that are being summarized (Current Summarizing Samples) and real samples. For the current summarizing samples, both the training gradients and the relationships to the past knowledge are constrained to be consistent with real images.
  • Figure 4: (a) The experiment results of increasing the memory size. (b) The computational cost comparison. (c) The parameter analysis on the relationship matching coefficient $\gamma$. (d) The parameter analysis on the summarizing interval $\tau$.
  • Figure 5: Visualizations of original images (left), data stream (middle) and summarized images (right) on CIFAR-100. Color, structure and texture information is integrated into the summarized images, which helps improve the replay effects. Best viewed in color.
  • ...and 3 more figures
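
The Figure 3 caption says that the training gradients of the current summarizing samples are constrained to be consistent with those of real images. The following is a minimal sketch of what such a gradient matching objective could look like, not the authors' implementation. It assumes a linear softmax classifier, a cross-entropy loss, and a cosine distance between flattened gradients. All function names are hypothetical. It only evaluates the loss. Updating the summarized samples to reduce it, and the relationship matching term, are omitted.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ce_gradient(weights, samples, labels):
    """Mean cross-entropy gradient w.r.t. a linear classifier's weights.

    weights: per-class weight vectors (num_classes x dim), an assumed toy model.
    Returns the gradient flattened to a list of length num_classes * dim.
    """
    num_classes, dim = len(weights), len(weights[0])
    grad = [[0.0] * dim for _ in range(num_classes)]
    for x, y in zip(samples, labels):
        logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
        probs = softmax(logits)
        for c in range(num_classes):
            # d(loss)/d(logit_c) = p_c - 1[c == y]
            err = probs[c] - (1.0 if c == y else 0.0)
            for d in range(dim):
                grad[c][d] += err * x[d] / len(samples)
    return [g for row in grad for g in row]


def cosine_distance(u, v):
    """1 - cosine similarity, with a small epsilon against zero norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + 1e-12)


def gradient_matching_loss(weights, real_x, real_y, summ_x, summ_y):
    """Distance between gradients induced by real and summarized samples."""
    g_real = ce_gradient(weights, real_x, real_y)
    g_summ = ce_gradient(weights, summ_x, summ_y)
    return cosine_distance(g_real, g_summ)


# Toy usage: two classes, two-dimensional inputs.
w = [[0.1, -0.2], [0.0, 0.3]]
real_x, real_y = [[1.0, 0.0], [0.0, 1.0]], [0, 1]
same = gradient_matching_loss(w, real_x, real_y, real_x, real_y)
flipped = gradient_matching_loss(w, real_x, real_y, real_x, [1, 0])
```

In this toy setup, summarized samples that reproduce the real batch give a loss near zero, while the same inputs with flipped labels push the gradient in the opposite direction and give a loss above one. In a full pipeline the summarized samples themselves would be the optimized variables, typically through automatic differentiation in a deep learning framework.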