A Comprehensive Survey of Deep Transfer Learning for Anomaly Detection in Industrial Time Series: Methods, Applications, and Directions

Peng Yan; Ahmed Abdulkadir; Paul-Philipp Luley; Matthias Rosenthal; Gerrit A. Schatte; Benjamin F. Grewe; Thilo Stadelmann

TL;DR

This survey analyzes deep transfer learning (DTL) for anomaly detection in industrial time series, framing the problem through inductive and transductive transfer and detailing four DTL strategies: instance, parameter, mapping, and domain-adversarial transfer. It provides an application-centric review across manufacturing process monitoring, predictive maintenance, energy management, and infrastructure monitoring, highlighting that parameter transfer dominates practice while domain shifts and data-label constraints remain critical challenges. The authors offer practical guidelines for data preprocessing, generative-AI-driven augmentation, and imbalance handling, and discuss architecture choices from CNN/LSTM baselines toward more advanced models and foundation-model ideas. They conclude that, despite current limitations, DTL holds substantial potential for robust, data-efficient anomaly detection in dynamic industrial settings and advocate a data-centric, integrated ML approach to realize it at scale.

Abstract

Automating the monitoring of industrial processes has the potential to enhance efficiency and optimize quality by promptly detecting abnormal events and thus facilitating timely interventions. Deep learning, with its capacity to discern non-trivial patterns within large datasets, plays a pivotal role in this process. Standard deep learning methods are suited to solving a specific task given a specific type of data. During training, deep learning demands large volumes of labeled data. However, due to the dynamic nature of industrial processes and environments, it is impractical to acquire large-scale labeled data anew for every slightly different case. Deep transfer learning offers a solution to this problem. By leveraging knowledge from related tasks and accounting for variations in data distributions, the transfer learning framework solves new tasks with little or even no additional labeled data. The approach bypasses the need to retrain a model from scratch for every new setup and dramatically reduces the labeled data requirement. This survey first provides an in-depth review of deep transfer learning, examining the problem settings of transfer learning and classifying the prevailing deep transfer learning methods. Moreover, we delve into applications of deep transfer learning in the context of a broad spectrum of time series anomaly detection tasks prevalent in primary industrial domains, e.g., manufacturing process monitoring, predictive maintenance, energy management, and infrastructure facility monitoring. We discuss the challenges and limitations of deep transfer learning in industrial contexts and conclude the survey with practical directions and actionable suggestions to address the need to leverage diverse time series data for anomaly detection in an increasingly dynamic production environment.

Paper Structure (48 sections, 7 figures, 2 tables)

This paper contains 48 sections, 7 figures, 2 tables.

Introduction
Motivation and contribution
Survey methodology
Deep transfer learning
Overview of the field
Formal description of deep transfer learning
Deep transfer learning approaches
Instance transfer
Parameter transfer
Mapping transfer
Domain-adversarial transfer
Related learning paradigms
Time series anomaly detection in industry
Anomaly types
Challenges
...and 33 more sections

Figures (7)

Figure 1: Transfer learning is useful when changes in production take place and sufficient data for full retraining is not available, as shown here for a hypothetical production of two types of gears. In the production of gear A, a lot of data is available to train a deep learning model that helps improve production. In the production of gear B, data is more limited, and the traditionally trained deep learning model fails to improve production. With suitable transfer learning methods, however, data and algorithms acquired during the production of gear A can be leveraged to support improving the production of gear B because the data and tasks in the production of both gears are related.
Figure 2: Venn diagram of this survey's focus on the intersection of transfer learning, anomaly detection, and time series analysis.
Figure 3: The generic taxonomy used in this paper to analyze deep transfer learning for industrial time series anomaly detection.
Figure 4: Taxonomy of transfer learning problem settings (left; see the section "Formal description of deep transfer learning" for the definition of terms) and corresponding examples using deep transfer learning approaches (right). On the left, we classify transfer learning problems as inductive or transductive transfer settings. Correspondingly, we provide two examples using deep transfer learning methods. In the inductive transfer setting, we collect time series data from screw production and wrench production. Labeled screw data (A1) is used to train a model that detects collective anomalies (a set of data points behaving differently compared to the entire time series (Choi et al., 2021; Chandola et al., 2009), further explained in the section "Time series anomaly detection in industry"). Then, parameter transfer (see the section "Parameter transfer") is applied to transfer knowledge by fine-tuning the model pre-trained on labeled screw data to detect point anomalies (further explained in the section "Time series anomaly detection in industry") on labeled wrench data (A2). For the transductive transfer setting in the lower panel, we present a different situation for contextual anomaly detection (further explained in the section "Time series anomaly detection in industry"). In this case, we have two datasets, B1 and B2, analyzed using the same model. However, the data in B2 differs significantly in appearance from the data in B1. To address this problem, instance transfer (further explained in the section "Instance transfer") is used. Through this learning process, the data in B2 is transformed in a way that makes it compatible with the model that has been trained exclusively on data from B1. Transfer learning, in this case, is thus achieved by adapting the data to fit the model through domain adaptation rather than adjusting the model to fit new data.
Figure 5: Three time series anomaly types. Gray lines represent recorded time series signals, and dashed green lines are a priori set thresholds of normal operations. The red dots and the red line indicate anomalies. Point anomalies are single values that fall outside of a pre-set range (left panel). Contextual anomalies are samples that deviate from the current context (middle panel). Collective anomalies are defined as a series of data points that all fall within the range of operation but jointly are not expected (right panel).
...and 2 more figures
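
The three anomaly types described in the caption of Figure 5 can be illustrated with a small Python sketch. The detectors below are simple rule-based stand-ins chosen for illustration, not methods from the survey: the function names, the trailing-window mean used as "context", and the flat-line rule for collective anomalies are all assumptions, as are the toy signals and thresholds.

```python
from statistics import mean


def point_anomalies(series, low, high):
    """Indices of single values that fall outside the preset range [low, high]."""
    return [i for i, x in enumerate(series) if x < low or x > high]


def contextual_anomalies(series, window, max_dev):
    """Indices whose value deviates from its local context.

    Assumption: the context is the mean of the preceding `window` values.
    """
    flagged = []
    for i in range(window, len(series)):
        context = mean(series[i - window:i])
        if abs(series[i] - context) > max_dev:
            flagged.append(i)
    return flagged


def collective_anomalies(series, window, min_spread):
    """Start indices of windows that are jointly unexpected.

    Assumption: the signal normally oscillates, so a window whose spread
    (max - min) stays below `min_spread` counts as a collective anomaly,
    even though every single value lies within the operating range.
    """
    flagged = []
    for start in range(len(series) - window + 1):
        chunk = series[start:start + window]
        if max(chunk) - min(chunk) < min_spread:
            flagged.append(start)
    return flagged


# Hypothetical toy signals; the operating range is assumed to be [-1, 1].
spike = [0.0, 0.5, -0.5, 3.0, 0.2]               # one value leaves the range
drift = [0.0, 0.1, 0.2, 0.3, 0.9, 0.5, 0.6]      # in range, but a sudden jump
flat = [1, -1, 1, -1, 0, 0, 0, 0, 1, -1]         # in range, but stops oscillating

print(point_anomalies(spike, -1, 1))
print(contextual_anomalies(drift, window=3, max_dev=0.4))
print(collective_anomalies(flat, window=4, min_spread=0.5))
```

Note that the range check alone finds nothing in the second and third signals, since all of their values stay within the assumed operating range. That is the distinction the caption draws between point anomalies and the other two types.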
