Regularization-Based Efficient Continual Learning in Deep State-Space Models

Yuanhang Zhang, Zhidi Lin, Yiyong Sun, Feng Yin, Carsten Fritsche

TL;DR

This work addresses catastrophic forgetting in deep state-space models (DSSMs) by integrating regularization-based continual learning (CL) methods, yielding continual learning DSSMs (CLDSSMs) whose memory and computation costs stay constant as the task sequence grows. The approach combines an autodifferentiable Ensemble Kalman Filter (EnKF) framework for efficient learning of latent dynamics with four regularization schemes that preserve prior-task knowledge while adapting to new tasks: Elastic Weight Consolidation (EWC), Memory Aware Synapses (MAS), Synaptic Intelligence (SI), and Learning without Forgetting (LwF). Experiments on real-world power and weather datasets show that CLDSSMs consistently mitigate forgetting and can accelerate learning for new tasks, with LwF often delivering the best forecasting accuracy and EWC/MAS offering favorable memory/computation profiles. Overall, CLDSSMs provide a scalable, task-efficient solution for continual learning in dynamic system modeling, enabling robust multi-task forecasting in resource-constrained settings.

Abstract

Deep state-space models (DSSMs) have gained popularity in recent years due to their strong modeling capacity for dynamic systems. However, existing DSSM work is limited to single-task modeling, which requires retraining with historical task data whenever a previously learned task is revisited. To address this limitation, we propose continual learning DSSMs (CLDSSMs), which are capable of adapting to evolving tasks without catastrophic forgetting. Our proposed CLDSSMs integrate mainstream regularization-based continual learning (CL) methods, ensuring efficient updates with constant computational and memory costs for modeling multiple dynamic systems. We also conduct a comprehensive cost analysis of each CL method applied to the respective CLDSSMs, and demonstrate the efficacy of CLDSSMs through experiments on real-world datasets. The results corroborate that, while the competing CL methods exhibit different merits, the proposed CLDSSMs consistently outperform traditional DSSMs in addressing catastrophic forgetting and enable swift and accurate parameter transfer to new tasks.
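
The idea behind a regularization-based CL update can be illustrated with a minimal sketch of an EWC-style quadratic penalty. This is not the authors' implementation: the function names (`diagonal_importance`, `ewc_penalty`, `regularized_loss`), the flat parameter lists, and the squared-gradient importance estimate are assumptions made for illustration, and the task loss, which in the paper would come from the EnKF-based objective, is passed in as a plain number. The sketch shows why the cost can stay constant: only one anchor vector and one importance vector are stored, regardless of how much data the earlier task contained.

```python
from typing import List, Sequence


def diagonal_importance(per_sample_grads: Sequence[Sequence[float]]) -> List[float]:
    """Average squared gradient per parameter.

    Assumption: a diagonal, Fisher-style importance estimate computed from
    per-sample gradients of the previous task's loss.
    """
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[i] ** 2 for g in per_sample_grads) / n for i in range(dim)]


def ewc_penalty(
    theta: Sequence[float],
    anchor: Sequence[float],
    importance: Sequence[float],
    lam: float,
) -> float:
    """Quadratic penalty (lam / 2) * sum_i importance_i * (theta_i - anchor_i)^2."""
    return 0.5 * lam * sum(
        w * (t - a) ** 2 for t, a, w in zip(theta, anchor, importance)
    )


def regularized_loss(
    task_loss: float,
    theta: Sequence[float],
    anchor: Sequence[float],
    importance: Sequence[float],
    lam: float,
) -> float:
    """New-task loss plus the penalty that discourages drift from the anchor."""
    return task_loss + ewc_penalty(theta, anchor, importance, lam)


# Hypothetical numbers: two parameters, two per-sample gradients from the old task.
importance = diagonal_importance([[1.0, 0.0], [3.0, 2.0]])
anchor = [0.0, 0.0]   # parameters stored after the old task
theta = [1.0, 2.0]    # candidate parameters while training on the new task
total = regularized_loss(0.5, theta, anchor, importance, lam=2.0)
```

Parameters with a large importance value are pulled strongly back toward the anchor, while unimportant ones remain free to adapt to the new task; the weight `lam` sets the trade-off between the two.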

Paper Structure (16 sections, 14 equations, 14 figures, 2 tables)

Figures (14)

  • Figure 1: Graphical representation of an SSM.
  • Figure 2: Illustration of continual learning in DSSMs.
  • Figure 3: DSSM-first task only
  • Figure 4: DSSM
  • Figure 5: CLDSSM-EWC
  • ...and 9 more figures