PTMs-TSCIL Pre-Trained Models Based Class-Incremental Learning

Yuanlong Wu, Mingxing Nie, Tao Zhu, Liming Chen, Huansheng Ning, Yaping Wan

TL;DR

This work tackles time series class-incremental learning (TSCIL) under restricted data access by leveraging pre-trained time-series models (PTMs) in a parameter-efficient fashion. The authors freeze the PTM backbone and incrementally tune a shared adapter, using knowledge distillation (KD) to curb overfitting and a Drift Compensation Network (DCN) to model and correct feature drift between old and new task representations. A three-stage optimization framework combines drift correction, adapter KD, and prototype-based classifier retraining to maintain stability while enabling plasticity, achieving state-of-the-art results on five real-world datasets without exemplar storage. The approach offers a practical paradigm for non-exemplar continual learning in time-series domains and highlights the potential of large time-series PTMs to improve TSCIL performance.

Abstract

Class-incremental learning (CIL) for time series data must balance stability (resisting catastrophic forgetting) against plasticity (acquiring new knowledge), particularly under real-world constraints where access to historical data is restricted. While pre-trained models (PTMs) have shown promise in CIL for vision and NLP domains, their potential in time series class-incremental learning (TSCIL) remains underexplored due to the scarcity of large-scale time series pre-trained models. Prompted by the recent emergence of large-scale PTMs for time series data, we present the first exploration of PTM-based TSCIL. Our approach couples a frozen PTM backbone with incremental tuning of a shared adapter, preserving generalization capabilities while mitigating feature drift through knowledge distillation. Furthermore, we introduce a Feature Drift Compensation Network (DCN), trained with a novel two-stage strategy to precisely model feature space transformations across incremental tasks. This allows old class prototypes to be projected accurately into the new feature space. Retraining the unified classifier with DCN-corrected prototypes mitigates feature drift and alleviates catastrophic forgetting. Extensive experiments on five real-world datasets demonstrate state-of-the-art performance, with our method yielding final accuracy gains of 1.4% to 6.1% across all datasets compared to existing PTM-based approaches. Our work establishes a new paradigm for TSCIL, providing insights into stability-plasticity optimization for continual learning systems.
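
The idea of projecting old class prototypes into the new feature space can be illustrated with a small sketch. The abstract describes a learned network (the DCN) for this mapping; the example below swaps in a ridge-regularized linear map as a simpler stand-in, and all names (`fit_linear_drift_map`, `compensate_prototypes`) and the synthetic data are hypothetical, not taken from the paper. It assumes that features of the same samples are available from both the old and the new model, so that the old-to-new transformation can be fitted and then applied to stored prototypes.

```python
import numpy as np

def fit_linear_drift_map(old_feats, new_feats, ridge=1e-3):
    """Fit W, b so that new_feats ~= old_feats @ W + b (ridge least squares).

    A linear stand-in for a drift compensation network; an assumption of
    this sketch, not the architecture used in the paper.
    """
    n, d = old_feats.shape
    X = np.hstack([old_feats, np.ones((n, 1))])   # append a bias column
    reg = ridge * np.eye(d + 1)
    reg[-1, -1] = 0.0                             # do not penalize the bias
    Wb = np.linalg.solve(X.T @ X + reg, X.T @ new_feats)
    return Wb[:-1], Wb[-1]

def compensate_prototypes(prototypes, W, b):
    """Project old-class prototypes into the new feature space."""
    return prototypes @ W + b

# Synthetic example: the "new model" features are a drifted copy of the old ones.
rng = np.random.default_rng(0)
d = 8
old_feats = rng.normal(size=(200, d))             # current-task samples, old model
true_W = np.eye(d) + 0.1 * rng.normal(size=(d, d))
true_b = 0.5 * rng.normal(size=d)
new_feats = old_feats @ true_W + true_b           # same samples, new model

old_prototypes = rng.normal(size=(3, d))          # stored means of old classes
real_new = old_prototypes @ true_W + true_b       # where they truly moved to

W, b = fit_linear_drift_map(old_feats, new_feats)
corrected = compensate_prototypes(old_prototypes, W, b)

# Mean L2 distance to the real prototypes, without and with compensation.
err_uncorrected = np.linalg.norm(old_prototypes - real_new, axis=1).mean()
err_corrected = np.linalg.norm(corrected - real_new, axis=1).mean()
print(err_uncorrected, err_corrected)
```

In this toy setting the drift is exactly linear, so the fitted map recovers it almost perfectly; real feature drift is generally nonlinear, which is one plausible motivation for using a trained network instead.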

Paper Structure

This paper contains 33 sections, 10 equations, 6 figures, 7 tables.

Figures (6)

  • Figure 1: TSCIL process schematic diagram and several representative method structure diagrams.
  • Figure 2: TSCIL process schematic diagram and several representative method structure diagrams.
  • Figure 3: The framework of our proposed method. Left: The structure diagram of the Moment, Adapter, and local classification head. Right: The training process comprises these steps: (I) In task $t$, the Adapter and local classifier are trained, and the Drift Compensation Network (DCN) is preliminarily trained. (II) The DCN is further trained separately by leveraging feature drift between the old and new models. (III) Drift compensation is applied to the old class prototypes, and new class prototypes are extracted and stored. Subsequently, samples are generated using these prototypes, and the classification head is retrained.
  • Figure 4: Evolution of average accuracy ($\mathcal{A}_i$) over the task sequence. Traditional CIL methods use circular markers, while PTM-based methods use triangular markers. Since Joint denotes joint training on the entire task sequence, its result is shown as a single point rather than a curve.
  • Figure 5: Comparison of the L2 Distance Between Updated Prototypes and Real Prototypes Across Tasks for Different Strategies.
  • ...and 1 more figures
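
Step (III) in the Figure 3 caption, generating samples from stored prototypes and retraining the classification head, can be sketched as follows. The caption does not say how samples are generated, so this example assumes isotropic Gaussian noise around each prototype and a plain softmax linear head trained by gradient descent; the function names, the noise level, and the toy prototypes are all hypothetical choices for illustration.

```python
import numpy as np

def sample_from_prototypes(prototypes, n_per_class, noise_std, rng):
    """Draw pseudo-features around each class prototype.

    Isotropic Gaussian noise is an assumption of this sketch.
    """
    num_classes, dim = prototypes.shape
    labels = np.repeat(np.arange(num_classes), n_per_class)
    noise = noise_std * rng.normal(size=(labels.size, dim))
    return prototypes[labels] + noise, labels

def retrain_linear_head(feats, labels, num_classes, lr=0.1, epochs=200):
    """Train a softmax linear classifier with full-batch gradient descent."""
    n, dim = feats.shape
    W = np.zeros((dim, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = feats @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                   # d(cross-entropy)/d(logits)
        W -= lr * feats.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

rng = np.random.default_rng(0)
# Toy prototypes for three classes (old ones would be drift-compensated first).
prototypes = np.array([[4.0, 0.0], [0.0, 4.0], [-4.0, -4.0]])
feats, labels = sample_from_prototypes(prototypes, n_per_class=100,
                                       noise_std=0.5, rng=rng)
W, b = retrain_linear_head(feats, labels, num_classes=3)

pred = (prototypes @ W + b).argmax(axis=1)            # classify the prototypes
train_acc = ((feats @ W + b).argmax(axis=1) == labels).mean()
print(pred, train_acc)
```

Because the head is retrained only on features drawn from prototypes, no raw samples of old classes need to be stored, which matches the non-exemplar setting described in the TL;DR.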