Balanced Gradient Sample Retrieval for Enhanced Knowledge Retention in Proxy-based Continual Learning
Hongye Xu, Jan Wasilewski, Bartosz Krawczyk
TL;DR
This work tackles catastrophic forgetting in continual learning by introducing a balanced sample retrieval strategy for memory buffers in a supervised contrastive framework. By combining gradient-aligned and gradient-conflicting samples, the method preserves past knowledge while stabilizing shared representations, mitigating proxy drift. The approach is supported by theoretical analysis of gradient interactions and empirical results showing state-of-the-art performance across six vision datasets, with improvements in both retention of past tasks and adaptation to new ones. The results indicate that balanced retrieval increases the diversity of the samples retrieved from the buffer and stabilizes training, offering practical gains for proxy-based continual learning in realistic data streams.
Abstract
Continual learning in deep neural networks often suffers from catastrophic forgetting, where representations for previous tasks are overwritten during subsequent training. We propose a novel sample retrieval strategy from the memory buffer that leverages both gradient-conflicting and gradient-aligned samples to effectively retain knowledge about past tasks within a supervised contrastive learning framework. Gradient-conflicting samples are selected for their potential to reduce interference by re-aligning gradients, thereby preserving past task knowledge. Meanwhile, gradient-aligned samples are incorporated to reinforce stable, shared representations across tasks. By balancing gradient correction from conflicting samples with alignment reinforcement from aligned ones, our approach increases the diversity among retrieved instances and achieves superior alignment in parameter space, significantly enhancing knowledge retention and mitigating proxy drift. Empirical results demonstrate that using both sample types outperforms methods relying solely on one sample type or random retrieval. Experiments on popular continual learning benchmarks in computer vision validate our method's state-of-the-art performance in mitigating forgetting while maintaining competitive accuracy on new tasks.
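The balanced retrieval idea described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that per-sample gradients for the buffer have already been computed and flattened into rows of a matrix, that agreement with the current-task gradient is scored by cosine similarity, and that the retrieved set is split evenly between the two sample types. The function name `balanced_retrieval` and the parameters `buffer_grads`, `current_grad`, and `k` are hypothetical.

```python
import numpy as np


def balanced_retrieval(buffer_grads, current_grad, k):
    """Pick k buffer samples: half gradient-conflicting, half gradient-aligned.

    Illustrative assumptions (not taken from the paper):
      - buffer_grads: (n, d) array, one flattened gradient per buffer sample
      - current_grad: (d,) array, gradient on the incoming batch
      - scoring uses cosine similarity; the split between types is even
    """
    buffer_grads = np.asarray(buffer_grads, dtype=float)
    current_grad = np.asarray(current_grad, dtype=float)
    n = buffer_grads.shape[0]
    if k > n:
        raise ValueError("k must not exceed the number of buffer samples")

    eps = 1e-12  # guards against division by zero for zero gradients
    g_unit = current_grad / (np.linalg.norm(current_grad) + eps)
    row_norms = np.linalg.norm(buffer_grads, axis=1) + eps
    cos = (buffer_grads @ g_unit) / row_norms

    # Ascending order: most negative cosine (most conflicting) comes first.
    order = np.argsort(cos)
    n_conflicting = k // 2
    n_aligned = k - n_conflicting

    conflicting = order[:n_conflicting]
    aligned = order[n - n_aligned:]
    return conflicting, aligned, cos


# Toy usage with five 2-D "gradients" and a current gradient along the x-axis.
toy_buffer = np.array([
    [1.0, 0.0],    # fully aligned
    [-1.0, 0.0],   # fully conflicting
    [0.0, 1.0],    # orthogonal
    [0.5, 0.5],    # partly aligned
    [-0.5, 0.1],   # mostly conflicting
])
conflicting_idx, aligned_idx, scores = balanced_retrieval(
    toy_buffer, np.array([1.0, 0.0]), k=4
)
```

In this toy example the orthogonal sample is left out, since it neither corrects nor reinforces the current update. A retrieval rule that used only one end of the ranking would return a less varied set, which is the contrast the abstract draws with single-type and random retrieval.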
