AAD-LLM: Adaptive Anomaly Detection Using Large Language Models

Alicia Russell-Gilbert, Alexander Sommers, Andrew Thompson, Logan Cummins, Sudip Mittal, Shahram Rahimi, Maria Seale, Joseph Jaboure, Thomas Arnold, Joshua Church

TL;DR

Results suggest that anomaly detection can be converted into a "language" task to deliver effective, context-aware detection in data-constrained industrial applications.

Abstract

In data-constrained, complex, and dynamic industrial environments, there is a critical need for transferable and multimodal methodologies that enhance anomaly detection and thereby prevent the costs associated with system failures. Traditional predictive maintenance (PdM) approaches are typically neither transferable nor multimodal. This work examines the use of Large Language Models (LLMs) for anomaly detection in complex and dynamic manufacturing systems. The research aims to improve the transferability of anomaly detection models by leveraging LLMs and seeks to validate the effectiveness of the proposed approach in data-sparse industrial applications. It also seeks to enable more collaborative decision-making between the model and plant operators by allowing input series data to be enriched with semantics. Additionally, the research addresses concept drift in dynamic industrial settings by integrating an adaptability mechanism. The literature review examines the latest developments in LLM time series tasks alongside associated adaptive anomaly detection methods to establish a robust theoretical framework for the proposed architecture. This paper presents a novel multimodal model framework (AAD-LLM) that does not require any training or fine-tuning on the dataset it is applied to. Results suggest that anomaly detection can be converted into a "language" task to deliver effective, context-aware detection in data-constrained industrial applications. This work therefore contributes significantly to advancements in anomaly detection methodologies.

Paper Structure

This paper contains 14 sections, 4 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Process flow diagram of major components in our use-case extrusion process. The major components in the extrusion process are in a series configuration. The number of Feed and Screw/Barrel Systems depends on the manufacturing line number and can be 3, 4, or 5.
  • Figure 2: The die head system for our use-case. The screen pack changer is identified by a red box. Within the screen pack changer, screen packs are used to prevent impurities from getting into the extruder together with the resin and thus clogging the die gap. The number of screen packs depends on the number of Screw/Barrel Systems. Each screen pack is arranged between the Screw/Barrel System and the Die Head System. During production, the resin melt flows through the screen pack.
  • Figure 3: SPC technique of moving average moving range to set control limits for process stability in a query series $Q_i$. Figure A and Figure B show the moving average and moving range, respectively. UCL is the defined upper control limit and LCL is the defined lower control limit. Series data points outside of the control limits are deemed "out of statistical control" and are labeled as anomalous. Out-of-control points can be seen before line (1). Points between lines (1) and (2) represent a stable process. Points after line (2) also represent a stable process; however, they are trending toward out of control and are therefore potentially problematic. AAD-LLM is applied to all points within control limits to enhance anomaly detection.
  • Figure 4: The model framework of AAD-LLM. Given an input time series $Q$ from the dataset $D$ under consideration, we first preprocess it using SPC techniques. Then (1) $Q$ is partitioned into a comparison dataset $C$ and query windows $Q^{(p)}$, where $p \in P$ and $P$ is the number of segmented windows. Next, statistical derivatives for $C$ and $Q^{(p)}$ are calculated and (2) injected into text templates. These templates are combined with task instructions to create the input prompt. To enhance the LLM's reasoning ability, (3) domain context is added to the prompt before being fed forward through the frozen LLM. The output from the LLM is (4) mapped to $\{0,1\}$ via a binarization function to obtain the final prediction. (5) Updates to $C$ are determined before moving to the next $Q^{(p)}$.
  • Figure 5: Prompt example. $<$cached info$>$ is the domain context information. $<$val$>$ are calculated statistical derivatives injected into respective text templates. Note that although each $Q_i$ is processed independently, prompts include text templates for all $i \in N$ where $N$ is the number of input variables in instance $Q$ from the dataset $D$ under consideration. Therefore, multivariate anomaly detection is explored.
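
The SPC step described in the Figure 3 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the window span of 2, the use of the standard Shewhart constants for subgroup size 2 (A2 = 1.880, D3 = 0, D4 = 3.267), and the function names `mamr_limits` and `out_of_control` are all assumptions made for illustration. The limits here are also estimated from the same series that is being checked, whereas in practice they would more likely be set from a reference period known to be stable.

```python
from statistics import mean

# Standard Shewhart control-chart constants for subgroup size 2.
# The span of 2 is an assumption; the source does not state the window size.
A2, D3, D4 = 1.880, 0.0, 3.267


def mamr_limits(series):
    """Compute span-2 moving averages, moving ranges, and their control limits."""
    if len(series) < 2:
        raise ValueError("need at least two points")
    pairs = list(zip(series, series[1:]))
    moving_avg = [(a + b) / 2 for a, b in pairs]
    moving_rng = [abs(a - b) for a, b in pairs]
    center = mean(moving_avg)  # center line of the moving-average chart
    r_bar = mean(moving_rng)   # center line of the moving-range chart
    return {
        "moving_avg": moving_avg,
        "moving_rng": moving_rng,
        "ma_lcl": center - A2 * r_bar,
        "ma_ucl": center + A2 * r_bar,
        "mr_lcl": D3 * r_bar,
        "mr_ucl": D4 * r_bar,
    }


def out_of_control(series):
    """Return indices of span-2 windows whose average or range breaches a limit."""
    lim = mamr_limits(series)
    flagged = []
    for i, (ma, mr) in enumerate(zip(lim["moving_avg"], lim["moving_rng"])):
        avg_ok = lim["ma_lcl"] <= ma <= lim["ma_ucl"]
        rng_ok = lim["mr_lcl"] <= mr <= lim["mr_ucl"]
        if not (avg_ok and rng_ok):
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical sensor readings with a single spike at position 10.
    readings = [10.0] * 10 + [30.0] + [10.0] * 10
    print(out_of_control(readings))
```

Window index `i` covers points `i` and `i + 1` of the input, so a single spike is flagged in the two windows that contain it. Under the caption's description, windows flagged this way would be labeled anomalous directly, and only the remaining in-control points would be passed on to the LLM stage.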