Unlearning-based sliding window for continual learning under concept drift

Michal Wozniak, Marek Klonowski, Maciej Maczynski, Bartosz Krawczyk

Abstract

Traditional machine learning assumes a stationary data distribution, yet many real-world applications operate on nonstationary streams in which the underlying concept evolves over time. This problem can also be viewed as task-free continual learning under concept drift, where a model must adapt sequentially without explicit task identities or task boundaries. In such settings, effective learning requires both rapid adaptation to new data and forgetting of outdated information. A common solution is based on a sliding window, but this approach is often computationally demanding because the model must be repeatedly retrained from scratch on the most recent data. We propose a different perspective based on machine unlearning. Instead of rebuilding the model each time the active window changes, we remove the influence of outdated samples using unlearning and then update the model with newly observed data. This enables efficient, targeted forgetting while preserving adaptation to evolving distributions. To the best of our knowledge, this is the first work to connect machine unlearning with concept drift mitigation for task-free continual learning. Empirical results on image stream classification across multiple drift scenarios demonstrate that the proposed approach offers a competitive and computationally efficient alternative to standard sliding-window retraining. Our implementation can be found at https://anonymous.4open.science/r/MUNDataStream-60F3.
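The unlearn-then-update idea described in the abstract can be illustrated with a toy model. The sketch below is not the paper's method: it swaps in a nearest-centroid classifier, chosen only because its per-class sums make exact unlearning trivial (subtract what was added). The names `CentroidClassifier`, `slide`, and `max_chunks` are hypothetical and do not come from the paper or its repository.

```python
from collections import deque


class CentroidClassifier:
    """Toy nearest-centroid classifier stored as per-class running sums.

    Assumption: this stands in for the paper's model only to show the
    control flow; a neural network would need an approximate unlearning step.
    """

    def __init__(self, n_features):
        self.n_features = n_features
        self.sums = {}    # label -> per-feature sum of samples
        self.counts = {}  # label -> number of samples

    def learn(self, chunk):
        """Add the contribution of every (x, y) pair in the chunk."""
        for x, y in chunk:
            s = self.sums.setdefault(y, [0.0] * self.n_features)
            for j, v in enumerate(x):
                s[j] += v
            self.counts[y] = self.counts.get(y, 0) + 1

    def unlearn(self, chunk):
        """Remove the contribution of a chunk that was learned earlier."""
        for x, y in chunk:
            s = self.sums[y]
            for j, v in enumerate(x):
                s[j] -= v
            self.counts[y] -= 1
            if self.counts[y] == 0:
                # No samples of this class remain in the window.
                del self.sums[y], self.counts[y]

    def predict(self, x):
        """Return the label whose centroid is closest to x."""
        def sq_dist(y):
            n = self.counts[y]
            return sum((v - s / n) ** 2 for v, s in zip(x, self.sums[y]))
        return min(self.sums, key=sq_dist)


def slide(model, window, new_chunk, max_chunks):
    """Advance the window by one chunk without retraining from scratch."""
    if len(window) == max_chunks:
        model.unlearn(window.popleft())  # forget the outdated chunk
    model.learn(new_chunk)               # adapt to the newest chunk
    window.append(new_chunk)
```

Calling `slide(model, window, chunk, max_chunks)` once per incoming chunk, with `window = deque()`, keeps the model equivalent to one trained only on the chunks currently in the window, up to floating-point rounding in the running sums. The cost per step depends on the two chunks touched rather than on the full window, which is the efficiency argument the abstract makes against retraining.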

Paper Structure (17 sections, 1 theorem, 20 equations, 4 figures, 1 table, 2 algorithms)

This paper contains 17 sections, 1 theorem, 20 equations, 4 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

Let $\mathbb{E}_{z \sim P_{new}}[\ell(z, \theta)]$ be the expected loss on the new distribution $P_{new}$ for a model $\theta$. After $L$ steps, the window consists entirely of new data ($W_L = (D_{L+1}, \dots, D_{2L})$ and $D_i \sim P_{new}$ for $i = L+1, \ldots, 2L$). We can observe that

Figures (4)

  • Figure 1: The idea of UIL (Unlearned and Iteratively trained cLassifier)
  • Figure 2: The proposed experimental pipeline.
  • Figure 3: Comparison of SW and UIL on semantic Fashion MNIST data stream: a, c, e) [0, 2, 4, 7] → [1, 3, 5], and b, d, f) [0, 2, 4, 7] → [1, 3, 5]
  • Figure 4: Comparison of SW and UIL on sudden MNIST data stream: a, c, e) noise level 0.0 → 0.5, and b, d, f) 0.5 → 0.0.

Theorems & Definitions (2)

  • Theorem 1
  • proof