Summarizing Stream Data for Memory-Constrained Online Continual Learning
Jianyang Gu, Kai Wang, Wei Jiang, Yang You
TL;DR
The paper addresses memory-constrained online continual learning with Summarizing Stream Data (SSD), a method that distills stream information into a small set of informative memory samples. SSD matches training gradients and relationships to past tasks so that the summarized samples mimic the training effect of real data, and adds a past-assisted component to preserve distributional knowledge. The approach combines a dynamic memory, gradient matching, and relationship matching, and shows notable gains across multiple benchmarks, especially at small memory budgets, with modest computational overhead. These findings position SSD as a practical enhancement to replay-based online continual learning: it is compatible with existing baselines and offers a scalable path toward more memory-efficient continual learning.
Abstract
Replay-based methods have proved effective for online continual learning by rehearsing past samples from an auxiliary memory. Although much effort has gone into improving memory-based training schemes, the information carried by each sample in the memory remains under-investigated. When storage space is restricted, the informativeness of the memory becomes critical for effective replay. Some works design specific strategies to select representative samples, but because they keep only a small number of original images, the storage space is still not well utilized. To this end, we propose to Summarize the knowledge from the Stream Data (SSD) into more informative samples by distilling the training characteristics of real images. By maintaining the consistency of training gradients and of the relationship to past tasks, the summarized samples are more representative of the stream data than the original images. Extensive experiments on multiple online continual learning benchmarks show that the proposed SSD method significantly enhances the replay effects. With limited extra computational overhead, SSD provides an accuracy boost of more than 3% on sequential CIFAR-100 under an extremely restricted memory buffer. Code is available at https://github.com/vimar-gu/SSD.
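
The idea of keeping training gradients consistent between real and summarized samples can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a linear model with squared loss in place of a network, a squared Euclidean distance between gradients as the matching objective, and a simple backtracking step size. All names (`matching_loss`, `summarize`, and so on) are hypothetical. The summarized sample `s` is updated so that the gradient it induces on the model approaches the mean gradient of a real mini-batch.

```python
# Toy gradient-matching sketch (assumptions: linear model, squared loss,
# squared Euclidean gradient distance). Not the SSD reference code.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def loss_grad(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to the weights w."""
    r = dot(w, x) - y
    return [r * xi for xi in x]

def mean_grad(w, batch):
    """Average weight gradient over a mini-batch of (x, y) pairs."""
    grads = [loss_grad(w, x, y) for x, y in batch]
    return [sum(g[i] for g in grads) / len(grads) for i in range(len(w))]

def matching_loss(w, s, y_s, g_real):
    """Squared distance between the summarized and real gradients."""
    g_syn = loss_grad(w, s, y_s)
    return sum((a - b) ** 2 for a, b in zip(g_syn, g_real))

def matching_grad(w, s, y_s, g_real):
    """Analytic gradient of matching_loss with respect to the sample s."""
    r = dot(w, s) - y_s
    d = [r * si - gi for si, gi in zip(s, g_real)]
    ds = dot(d, s)
    return [2.0 * (wk * ds + r * dk) for wk, dk in zip(w, d)]

def summarize(w, s, y_s, real_batch, steps=50, lr=0.1):
    """Update s by gradient descent, halving the step until the loss drops."""
    g_real = mean_grad(w, real_batch)
    s = list(s)
    for _ in range(steps):
        g = matching_grad(w, s, y_s, g_real)
        current = matching_loss(w, s, y_s, g_real)
        step = lr
        for _ in range(30):
            cand = [si - step * gi for si, gi in zip(s, g)]
            if matching_loss(w, cand, y_s, g_real) < current:
                s = cand  # accept only steps that reduce the mismatch
                break
            step *= 0.5
    return s

# Hypothetical data: fixed model weights, three real samples, one memory slot.
w = [0.5, -0.3, 0.8]
real_batch = [([1.0, 2.0, 0.5], 1.0), ([0.8, 1.5, 0.7], 1.0), ([1.2, 2.2, 0.3], 1.0)]
s0, y_s = [0.0, 1.0, 0.0], 1.0

g_real = mean_grad(w, real_batch)
before = matching_loss(w, s0, y_s, g_real)
s1 = summarize(w, s0, y_s, real_batch)
after = matching_loss(w, s1, y_s, g_real)
print("gradient mismatch before:", before, "after:", after)
```

In this sketch a single stored sample stands in for a whole mini-batch, which is the intuition behind using summarized rather than original images when the memory budget is very small. A practical version would compute gradients through a deep network with automatic differentiation and would also include the relationship-matching term described in the abstract.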
