Failure Detection in Chemical Processes using Symbolic Machine Learning: A Case Study on Ethylene Oxidation

Julien Amblard, Niklas Groll, Matthew Tait, Mark Law, Gürkan Sin, Alessandra Russo

TL;DR

Experimental results show that symbolic machine learning can outperform baseline methods such as random forest and multilayer perceptron, while preserving interpretability through the generation of compact, rule-based predictive models.

Abstract

Over the past decade, Artificial Intelligence has advanced significantly, mostly driven by large-scale neural approaches. However, in the chemical process industry, where safety is critical, these methods are often unsuitable due to their brittleness and their lack of explainability and interpretability. Furthermore, open-source real-world datasets containing historical failures are scarce in this domain. In this paper, we investigate an approach for predicting failures in chemical processes using symbolic machine learning and conduct a feasibility study in the context of an ethylene oxidation process. Our method builds on a state-of-the-art symbolic machine learning system capable of learning predictive models in the form of probabilistic rules from context-dependent noisy examples. This system is a general-purpose symbolic learner, which makes our approach independent of any specific chemical process. To address the lack of real-world failure data, we conduct our feasibility study using data generated from a chemical process simulator. Experimental results show that symbolic machine learning can outperform baseline methods such as random forest and multilayer perceptron, while preserving interpretability through the generation of compact, rule-based predictive models. Finally, we explain how such learned rule-based models could be integrated into agents to assist chemical plant operators in decision-making during potential failures.

Paper Structure

This paper contains 13 sections, 12 equations, 5 figures, and 4 tables.

Figures (5)

  • Figure 1: Simulation flowsheet of the EO process, with a legend to identify components
  • Figure 2: A perturbation file for "low pressure at source"
  • Figure 3: ROC curves generated from the default learning parameter configuration, comparing our approach's performance to that of the baselines
  • Figure 4: Rules generated by DisPLAS for $T_{static}$ ($\texttt{low2} < \texttt{low1} < \texttt{normal} < \texttt{high1} < \texttt{high2}$) when including all nontrivial experiments and process variable measurements over 125 runs of the simulation
  • Figure 5: Rules generated by DisPLAS for $T_{dynamic}$, modifying only the learning parameter $t_{short\_term}$. The predicate $\texttt{unchanged}$ indicates process variables with equal initial and short-term values, while the remaining predicates in rule bodies refer to measurements taken at $t_{short\_term}$ and to (absolute) changes in measurement since $t=0$ (increase: $\nearrow$, decrease: $\searrow$)

Theorems & Definitions (3)

  • Definition 1
  • Definition 2
  • Definition 3