Continual Learning with Diffusion-based Generative Replay for Industrial Streaming Data

Jiayi He, Jiao Chen, Qianmiao Liu, Suyan Dai, Jianhua Tang, Dongpo Liu

TL;DR

The paper tackles data drift in industrial streaming data and the catastrophic forgetting it causes when models are updated under the resource constraints of the IIoT. It introduces Distillation-based Self-Guidance (DSG), a continual-learning framework that pairs a diffusion-based replay generator with distillation between successive generators to improve replay data quality and knowledge retention. The generator is trained with a combined objective comprising a current-task loss and a distillation loss. Empirical results on CWRU, DSA, and WISDM show that DSG yields consistent accuracy gains of about 2.9%–5.0% over an experience-replay baseline, and the analysis also examines how forgetting evolves across tasks and how faithful the generated samples are. The approach is relevant to industrial deployment, including edge-cloud collaboration, because it enables robust learning from evolving streaming data with limited resources.

Abstract

The Industrial Internet of Things (IIoT) integrates interconnected sensors and devices to support industrial applications, but its dynamic environments pose challenges related to data drift. Considering the limited resources and the need to effectively adapt models to new data distributions, this paper introduces a Continual Learning (CL) approach, Distillation-based Self-Guidance (DSG), to address the challenges presented by industrial streaming data via a novel generative replay mechanism. DSG uses knowledge distillation to transfer knowledge from the previous diffusion-based generator to the updated one, improving both the stability of the generator and the quality of the reproduced data, and thereby better mitigating catastrophic forgetting. Experimental results on the CWRU, DSA, and WISDM datasets demonstrate the effectiveness of DSG. DSG outperforms the state-of-the-art baseline in accuracy, with improvements ranging from 2.9% to 5.0% on key datasets, showcasing its potential for practical industrial applications.
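
The generator-to-generator distillation described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the exact loss is not given in this summary, so the code assumes a standard noise-prediction (denoising) loss for the current task and a mean-squared-error distillation term that pulls the updated generator's predictions toward those of the frozen previous generator. The names `combined_loss`, `new_gen`, `old_gen`, and the weight `lam` are hypothetical and chosen only for illustration.

```python
import math
import random


def add_noise(x0, eps, alpha_bar):
    """Forward diffusion: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * xi + b * ei for xi, ei in zip(x0, eps)]


def mse(u, v):
    """Mean squared error between two equal-length lists."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v)) / len(u)


def combined_loss(new_gen, old_gen, x0, alpha_bar, lam=1.0, rng=random):
    """Return (total, task_loss, distill_loss) for one sample.

    new_gen / old_gen: callables (x_t, alpha_bar) -> predicted noise.
    old_gen stands in for the frozen generator from the previous task.
    lam: assumed weight on the distillation term (hypothetical).
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = add_noise(x0, eps, alpha_bar)
    pred_new = new_gen(x_t, alpha_bar)
    # Current-task term: the updated generator should predict the injected noise.
    task_loss = mse(pred_new, eps)
    # Distillation term: stay close to the previous generator on the same input.
    distill_loss = mse(pred_new, old_gen(x_t, alpha_bar))
    return task_loss + lam * distill_loss, task_loss, distill_loss


if __name__ == "__main__":
    # Toy stand-ins for trained generators (purely illustrative).
    student = lambda x_t, a: [0.5 * v for v in x_t]
    teacher = lambda x_t, a: [0.4 * v for v in x_t]
    total, task, dist = combined_loss(
        student, teacher, [1.0, -2.0, 0.5], alpha_bar=0.9, lam=2.0,
        rng=random.Random(0),
    )
    print(f"total={total:.4f} task={task:.4f} distill={dist:.4f}")
```

In this sketch, setting `lam` to zero recovers plain training on the current task, while larger values keep the updated generator closer to its predecessor. How DSG actually balances the two terms, and on which inputs the distillation is applied, is specified in the paper itself rather than in this summary.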

Paper Structure

This paper contains 21 sections, 15 equations, 5 figures, 3 tables, and 1 algorithm.

Figures (5)

  • Figure 1: Continual learning of the bearing fault diagnosis model. With streaming data from different operating conditions, the model is required to be updated to cope with current data while mitigating the forgetting of previous learning.
  • Figure 2: Comparison of traditional generative replay and our proposed DSG.
  • Figure 3: Evolution of average accuracy of various methods.
  • Figure 4: Comparison of raw and generative samples.
  • Figure 5: Comparing DSG with the baselines among continual tasks using the CWRU dataset.