FaultExplainer: Leveraging Large Language Models for Interpretable Fault Detection and Diagnosis

Abdullah Khan, Rahul Nahar, Hao Chen, Gonzalo E. Constante Flores, Can Li

TL;DR

FaultExplainer presents an LLM-enabled framework for interpretable fault detection, diagnosis, and explanation in chemical processes by grounding LLM reasoning in PCA-based $T^2$ fault detection and a detailed Tennessee Eastman Process description. The method combines $T^2$ statistics, feature contribution analysis, and two targeted prompts to generate grounded root-cause explanations, and is evaluated on 15 TEP faults with GPT-4o and o1-preview. Results show plausible, actionable explanations in many cases, but also highlight failures caused by the limitations of PCA-selected features and by LLM hallucinations, especially for unseen fault scenarios. The work demonstrates a practical, open-source tool for real-time fault monitoring and operator-oriented explanations, and points to better feature selection and domain-specific LLM training as ways to improve reliability and interpretability in industrial settings.
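
To make the detection step concrete, the following is a minimal sketch of PCA-based $T^2$ monitoring with a per-variable contribution breakdown. It is a textbook-style formulation, not the authors' implementation: the function names, the empirical-quantile threshold, and the particular contribution decomposition (one common choice whose terms sum to $T^2$) are all assumptions made for illustration.

```python
import numpy as np


def fit_pca_monitor(X_normal, n_components, quantile=0.99):
    """Fit a PCA monitoring model on fault-free data (rows are samples).

    Assumption: the alarm threshold is an empirical quantile of the
    training T^2 values; other choices (e.g. an F-distribution limit) exist.
    """
    mean = X_normal.mean(axis=0)
    std = X_normal.std(axis=0, ddof=1)
    Z = (X_normal - mean) / std
    # Principal directions are the right singular vectors of the scaled data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T                      # (n_features, k)
    eigvals = s[:n_components] ** 2 / (Z.shape[0] - 1)  # score variances
    model = {"mean": mean, "std": std, "loadings": loadings, "eigvals": eigvals}
    t2_train = np.array([t2_and_contributions(model, x)[0] for x in X_normal])
    model["threshold"] = float(np.quantile(t2_train, quantile))
    return model


def t2_and_contributions(model, x):
    """Return the T^2 statistic of one sample and per-variable contributions."""
    z = (x - model["mean"]) / model["std"]
    scores = z @ model["loadings"]
    weighted = scores / model["eigvals"]
    t2 = float(scores @ weighted)  # sum_k t_k^2 / lambda_k
    # Illustrative decomposition: c_j = z_j * sum_k (t_k / lambda_k) * P[j, k].
    # The c_j sum to T^2; individual terms may be negative.
    contributions = z * (model["loadings"] @ weighted)
    return t2, contributions


def top_contributors(contributions, names, k=6):
    """Rank variables by contribution, largest first, and keep the top k."""
    order = np.argsort(contributions)[::-1][:k]
    return [(names[i], float(contributions[i])) for i in order]
```

A sample would be flagged when its `t2` exceeds `model["threshold"]`, and the output of `top_contributors` is the kind of ranked variable list that could then be handed to the LLM for explanation.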

Abstract

Machine learning algorithms are increasingly being applied to fault detection and diagnosis (FDD) in chemical processes. However, existing data-driven FDD platforms often lack interpretability for process operators and struggle to identify root causes of previously unseen faults. This paper presents FaultExplainer, an interactive tool designed to improve fault detection, diagnosis, and explanation in the Tennessee Eastman Process (TEP). FaultExplainer integrates real-time sensor data visualization, Principal Component Analysis (PCA)-based fault detection, and identification of top contributing variables within an interactive user interface powered by large language models (LLMs). We evaluate the LLMs' reasoning capabilities in two scenarios: one where historical root causes are provided, and one where they are withheld to mimic the challenge of previously unseen faults. Experimental results using the GPT-4o and o1-preview models demonstrate the system's strengths in generating plausible and actionable explanations, while also highlighting its limitations, including reliance on PCA-selected features and occasional hallucinations.
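
The two evaluation scenarios can be pictured as a single prompt builder that either includes or omits a list of candidate root causes. The sketch below is purely hypothetical: the function name, the prompt wording, and the percentage-change format for the features are assumptions for illustration and are not the prompts used in the paper.

```python
def build_diagnosis_prompt(process_description, top_features, known_root_causes=None):
    """Assemble a grounded diagnosis prompt (hypothetical wording).

    top_features: list of (variable name, percent change vs. normal) pairs.
    known_root_causes: optional list of candidate causes; None models the
    scenario in which historical root causes are withheld.
    """
    lines = [
        "You are assisting a chemical process operator.",
        "Process description:",
        process_description,
        "",
        "Top contributing variables at the time of the alarm:",
    ]
    for name, change in top_features:
        lines.append(f"- {name}: {change:+.1f}% vs. normal operation")
    lines.append("")
    if known_root_causes:
        lines.append("Candidate root causes (historical):")
        lines.extend(f"{i}. {cause}" for i, cause in enumerate(known_root_causes, 1))
        lines.append(
            "Select the most likely candidate and justify it using only the variables above."
        )
    else:
        lines.append(
            "No list of historical root causes is available. Propose the most likely "
            "root cause and justify it using only the variables above."
        )
    return "\n".join(lines)
```

Keeping the two scenarios behind one function means the only difference between them is the presence of the candidate list, which makes it easier to attribute any change in explanation quality to that information alone.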

Paper Structure

This paper contains 16 sections, 4 equations, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: Interactive Web Interface for FaultExplainer. From top to bottom are the process monitoring window, fault history tracker, and the interactive chat interface.
  • Figure 2: Overview of the Methods in FaultExplainer
  • Figure 3: An illustrative example where both models identify the correct root cause. The top six feature changes for Fault 7 (C Header Pressure Loss - Reduced Availability (Stream 4); Step) and the explanations given by GPT-4o and o1-preview when the 15 root causes are provided.
  • Figure 4: An illustrative example where neither model identifies the correct root cause. The top six feature changes for Fault 10 (C Feed Temperature (Stream 4); Random Variation) and the explanations given by GPT-4o and o1-preview when the 15 root causes are provided. The hallucinated explanations are shown in italics.
  • Figure 5: The top six feature changes for Fault 2 (B Composition, A/C Ratio Constant (Stream 4); Step) and the explanations given by GPT-4o and o1-preview when the 15 root causes are not provided. The hallucinated explanations are shown in italics.
  • ...and 1 more figure