Learning Paradigms and Modelling Methodologies for Digital Twins in Process Industry
Michael Mayr, Georgios C. Chasparis, Josef Küng
TL;DR
The paper addresses how Digital Twins in the process industry are built using diverse modelling methodologies and learning paradigms. It systematically reviews the literature to map modelling methods (e.g., CNNs, AEs, PINNs), learning strategies (supervised, unsupervised, self-supervised, transfer learning), and task categories (classification, regression, clustering). The findings show that CNN/AE approaches and supervised learning dominate, with growing interest in hybrid physics-data models; self-supervised and transfer learning remain underexplored yet promising, and transformer-like architectures are not yet prevalent in industrial DTs. The study highlights the potential of pretraining and cross-task transfer to enable scalable, generalizable DTs in high-volume IIoT environments and outlines directions for future research and industry adoption.
Abstract
Central to the digital transformation of the process industry are Digital Twins (DTs), virtual replicas of physical manufacturing systems that combine sensor data with sophisticated data-based or physics-based models, or a combination thereof, to tackle a variety of industrially relevant tasks such as process monitoring, predictive control, or decision support. The backbone of a DT, i.e. the concrete modelling methodologies and the architectural frameworks supporting these models, is complex, diverse, and fast-evolving, so a thorough understanding of the latest state-of-the-art methods and trends is needed to stay on top of a highly competitive market. From a research perspective, despite the high interest in reviewing various aspects of DTs, a structured literature review that specifically focuses on unravelling the learning paradigms (e.g. self-supervised learning) used for DT creation in the process industry is a novel contribution to this field. This study aims to address this gap by (1) systematically analyzing the modelling methodologies (e.g. Convolutional Neural Network, Encoder-Decoder, Hidden Markov Model) and paradigms (e.g. data-driven, physics-based, hybrid) used for DT creation; (2) assessing the learning strategies used (e.g. supervised, unsupervised, self-supervised); (3) analyzing the type of modelling task (e.g. regression, classification, clustering); and (4) identifying the challenges and research gaps and discussing the potential resolutions proposed.
