What is the role of memorization in Continual Learning?

Jędrzej Kozal; Jan Wasilewski; Alif Ashrafee; Bartosz Krawczyk; Michał Woźniak

TL;DR

This work investigates the role of memorization in continual learning, distinguishing memorization from forgetting and introducing a computable memorization score and a cheaper training-time proxy. It shows that increasing the number of classes elevates memorization and that high-memorization samples are more prone to forgetting under distribution shifts, while memorization is still necessary for high performance. The authors propose Memorization-aware Experience Replay to leverage memorization during incremental training and demonstrate, across standard CL benchmarks and larger buffers, that memory-aware strategies yield improvements, especially when memory capacity is flexible. The study highlights implications for CL benchmark design and outlines future directions to localize memory-encoding components in networks and to develop robust incremental memorization measures.

Abstract

Memorization affects the performance of deep learning algorithms. Prior work has studied memorization primarily in the context of generalization and privacy. This work studies the effect of memorization in incremental learning scenarios. Forgetting prevention and memorization may seem similar, but their differences deserve discussion. We designed extensive experiments to evaluate the impact of memorization on continual learning. We show that training examples with high memorization scores are forgotten faster than regular samples. Our findings also indicate that memorization is necessary to achieve the highest performance. However, in low-memory regimes, the forgetting of regular samples matters more. We show that the importance of high-memorization-score samples rises as the buffer size increases. We introduce a memorization proxy and apply it to the buffer policy problem to showcase how memorization can be used during incremental training. We demonstrate that including samples with a higher proxy memorization score is beneficial when the buffer size is large.
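To make the buffer policy idea concrete, the following is a minimal sketch of score-based buffer selection. It is not the authors' Memorization-aware Experience Replay implementation: the function name `select_buffer_indices`, its parameters, and the assumption that a per-sample proxy memorization score is already available as a list of floats are all hypothetical, since this section does not specify how the proxy is computed.

```python
from typing import List, Sequence


def select_buffer_indices(
    proxy_scores: Sequence[float],
    buffer_size: int,
    prefer_high: bool = True,
) -> List[int]:
    """Pick which samples of a task to keep in a replay buffer.

    proxy_scores: one hypothetical proxy memorization score per sample
                  (how the score is obtained is outside this sketch).
    buffer_size:  number of samples the buffer can hold for this task.
    prefer_high:  keep the highest-scoring samples if True,
                  the lowest-scoring samples if False.
    """
    if buffer_size < 0:
        raise ValueError("buffer_size must be non-negative")

    # Rank sample indices by score; ties keep their original order.
    ranked = sorted(
        range(len(proxy_scores)),
        key=lambda i: proxy_scores[i],
        reverse=prefer_high,
    )

    # Keep at most buffer_size indices, returned in dataset order.
    return sorted(ranked[:buffer_size])


scores = [0.1, 0.9, 0.4, 0.7]
high = select_buffer_indices(scores, buffer_size=2, prefer_high=True)
low = select_buffer_indices(scores, buffer_size=2, prefer_high=False)
```

The `prefer_high` switch mirrors the trade-off described in the abstract: favoring high-score samples is reported as beneficial for large buffers, while regular (low-score) samples matter more when memory is tight. Which setting to use at a given buffer size is an empirical choice.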
