Benchmarking Domain Adaptation for Chemical Processes on the Tennessee Eastman Process

Eduardo Fernandes Montesuma, Michela Mulas, Fred Ngolè Mboula, Francesco Corona, Antoine Souloumiac

TL;DR

The paper addresses distribution shifts in fault diagnosis for chemical processes by introducing a Tennessee Eastman Process–based benchmark and evaluating 11 domain adaptation (DA) strategies across single- and multi-source settings. It finds that optimal transport–based methods (e.g., JDOT, WJDOT, WBT, DaDiL) outperform MMD- and $d_{\mathcal{H}}$-based approaches, with multi-source DA offering gains when the sources are informative. The study provides a detailed benchmark construction, exploratory data analysis, and a comprehensive comparison on time-series data, including a public open-source implementation to facilitate replication. The work highlights practical implications for robust cross-mode fault diagnosis and motivates further research at the intersection of DA and chemical-process monitoring.

Abstract

In system monitoring, automatic fault diagnosis seeks to infer the system's state based on sensor readings, e.g., through machine learning models. In this context, it is of key importance that, based on historical data, these systems are able to generalize to incoming data. In parallel, many factors may induce changes in the data probability distribution, hindering the ability of such models to generalize. In this sense, domain adaptation is an important framework for adapting models to different probability distributions. In this paper, we propose a new benchmark, based on the Tennessee Eastman Process of Downs and Vogel (1993), for benchmarking domain adaptation methods in the context of chemical processes. Besides describing the process and its relevance for domain adaptation, we describe a series of data processing steps for reproducing our benchmark. We then test 11 domain adaptation strategies on this novel benchmark, showing that optimal transport-based techniques outperform other strategies.

Paper Structure (8 sections, 9 equations, 9 figures, 5 tables)

Figures (9)

  • Figure 1: Illustration of a deep neural net, where data $\mathbf{x}_{i}^{(P)}$ are mapped into latent representation vectors $\mathbf{z}_{i}^{(P)}$ through an encoder $\phi$. The latent representation is then used to predict a class, i.e., $\hat{y}_{i}^{(P)}$.
  • Figure 2: Domain adaptation based on data transformation. In an ambient space, source and target data follow different probability distributions. As a result, a classifier learned on the source (blue straight line on the left) is not able to generalize on data from the target domain (orange elements). In this paper we consider methods that align the distributions through a data transformation $T$, which maps data into a latent space.
  • Figure 3: P&ID diagram for the Tennessee Eastman Process. Figure reproduced from Bathelt et al. (2015), which shows the main components of the process. Measurements originally introduced by Downs and Vogel (1993) are shown in gray, whereas the measurements introduced by Bathelt et al. (2015) are shown in red. A simulation environment, based on this diagram, is described in Reinartz et al. (2021).
  • Figure 4: Qualitative analysis of distributional shift. In (a) and (b), we show the correlation between different variables in the Tennessee Eastman Process, for modes 1 and 2, for each fault. On each correlation matrix, the coefficient $\rho_{jj'}$ corresponds to the Pearson correlation coefficient between $\{x_{j,t}\}_{t=1}^{600}$ and $\{x_{j',t}\}_{t=1}^{600}$ across simulations.
  • Figure 5: Quantitative analysis of distributional shift. Pairwise Wasserstein distance between modes (a). Mode embeddings based on MDS (b).
  • ...and 4 more figures
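
To make the quantity in the Figure 4 caption concrete, the sketch below computes a matrix of Pearson coefficients $\rho_{jj'}$ between the columns of a single time series with 600 steps. It is only an illustration: the data are synthetic, the helper name `correlation_matrix` and the number of variables are assumptions, and the way the paper aggregates coefficients across simulations is not stated in this excerpt.

```python
import numpy as np


def correlation_matrix(series: np.ndarray) -> np.ndarray:
    """Pearson correlation between the columns of a (T, d) time series."""
    # rowvar=False treats each column as one variable observed over T steps.
    return np.corrcoef(series, rowvar=False)


rng = np.random.default_rng(0)
T, d = 600, 4  # 600 steps as in the caption; d = 4 is an arbitrary choice

# Synthetic stand-in for one simulation (NOT Tennessee Eastman data):
# variables 0 and 1 are driven by a shared signal with opposite signs,
# variables 2 and 3 are independent noise.
shared = rng.normal(size=(T, 1))
series = np.hstack([
    shared + 0.1 * rng.normal(size=(T, 1)),
    -shared + 0.1 * rng.normal(size=(T, 1)),
    rng.normal(size=(T, 2)),
])

rho = correlation_matrix(series)  # rho[j, k] plays the role of rho_{jj'}
print(np.round(rho, 2))
```

Comparing such matrices between two operating modes, as the caption describes for modes 1 and 2, gives a qualitative view of how the dependence between variables shifts.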