Understanding Continual Learning Settings with Data Distribution Drift Analysis
Timothée Lesort, Massimo Caccia, Irina Rish
TL;DR
This paper reframes continual learning (CL) through a drift-centric lens by introducing a hidden context variable $c$ that captures non-stationarity in the data distribution. It distinguishes several drift types: Real Concept Drift ($P(y|x)$ changes while $P(x)$ stays fixed), Virtual Drift ($P(x)$ changes while $P(y|x)$ stays fixed), and subtypes such as Virtual Concept Drift and Domain Drift. It also discusses criterion drift, where the learning objective itself changes. It analyzes evaluation protocols, proposing two main paradigms (online cumulative performance and current/final performance), and surveys assumptions about drifts (stationarities, patterns, and intensity) that can be used to tailor learning strategies. The paper then delineates three real-life CL scenarios (Incremental Learning, Lifelong Learning, and Learning under real-concept drifts), provides formalizations and examples, and discusses the implications for benchmarks, supervision, and meta-learning. Overall, the work offers a unified framework for characterizing CL problems by their distribution drifts, aiming to improve benchmark design and ease transfer to real-world settings.
Abstract
Classical machine learning algorithms often assume that the data are drawn i.i.d. from a stationary probability distribution. Recently, continual learning has emerged as a rapidly growing area of machine learning in which this assumption is relaxed, i.e., the data distribution is non-stationary and changes over time. This paper represents the state of the data distribution by a context variable $c$. A drift in $c$ leads to a data distribution drift. A context drift may change the target distribution, the input distribution, or both. Moreover, distribution drifts may be abrupt or gradual. In continual learning, context drifts may interfere with the learning process and erase previously learned knowledge; thus, continual learning algorithms must include specialized mechanisms to deal with such drifts. In this paper, we aim to identify and categorize different types of context drifts and potential assumptions about them, to better characterize various continual-learning scenarios. Moreover, we propose to use the distribution drift framework to provide more precise definitions of several terms commonly used in the continual learning field.
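To make the role of the context variable concrete, the following is a minimal illustrative sketch, not taken from the paper. It assumes a toy one-dimensional stream in which a hypothetical context dictionary has two fields: `x_mean`, which controls the input distribution $P(x)$, and `threshold`, which controls the labeling rule $P(y|x)$. The helper names `sample`, `drift_type`, and `stream` are likewise assumptions made for illustration. Changing only `threshold` changes $P(y|x)$ with $P(x)$ fixed (a real concept drift), while changing only `x_mean` changes $P(x)$ with $P(y|x)$ fixed (a virtual drift).

```python
import random


def sample(context, rng):
    """Draw one (x, y) pair under a given (hypothetical) context.

    context["x_mean"]    controls the input distribution P(x).
    context["threshold"] controls the labeling rule P(y|x).
    """
    x = rng.gauss(context["x_mean"], 1.0)
    y = int(x > context["threshold"])
    return x, y


def drift_type(old, new):
    """Name the drift caused by moving from context `old` to context `new`."""
    input_changed = old["x_mean"] != new["x_mean"]
    concept_changed = old["threshold"] != new["threshold"]
    if concept_changed and not input_changed:
        return "real concept drift"   # P(y|x) changes, P(x) fixed
    if input_changed and not concept_changed:
        return "virtual drift"        # P(x) changes, P(y|x) fixed
    if input_changed and concept_changed:
        return "both"                 # input and target distributions change
    return "no drift"


def stream(contexts, steps_per_context, seed=0):
    """Yield samples from a sequence of contexts with abrupt switches."""
    rng = random.Random(seed)
    for context in contexts:
        for _ in range(steps_per_context):
            yield sample(context, rng)


if __name__ == "__main__":
    base = {"x_mean": 0.0, "threshold": 0.0}
    relabeled = {"x_mean": 0.0, "threshold": 1.0}
    shifted = {"x_mean": 2.0, "threshold": 0.0}

    print(drift_type(base, relabeled))  # real concept drift
    print(drift_type(base, shifted))    # virtual drift

    # A non-stationary stream: 5 samples per context, abrupt drift between them.
    for x, y in stream([base, shifted], steps_per_context=5):
        print(round(x, 2), y)
```

This sketch only models abrupt drifts, where the context switches from one step to the next. A gradual drift could be approximated under the same assumptions by interpolating the context fields over several steps.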
