Neuro-Symbolic Process Anomaly Detection

Devashish Gaikwad, Wil M. P. van der Aalst, Gyunam Park

Abstract

Process anomaly detection is an important application of process mining for identifying deviations from the normal behavior of a process. Neural network-based methods have recently been applied to this task, learning directly from event logs without requiring a predefined process model. However, because these methods treat anomaly detection as a purely statistical task, they fail to incorporate human domain knowledge. As a result, rare but conformant traces are often misclassified as anomalies due to their low frequency, which limits the effectiveness of the detection process. Recent developments in the field of neuro-symbolic AI have introduced Logic Tensor Networks (LTN) as a means to integrate symbolic knowledge into neural networks using real-valued logic. In this work, we propose a neuro-symbolic approach that integrates domain knowledge into neural anomaly detection using LTN and Declare constraints. Using autoencoder models as a foundation, we encode Declare constraints as soft logical guiderails within the learning process to distinguish between anomalous and rare but conformant behavior. Evaluations on synthetic and real-world datasets demonstrate that our approach improves F1 scores even when as few as 10 conformant traces exist, and that the choice of Declare constraint, and by extension the encoded human domain knowledge, significantly influences performance gains.

Paper Structure

This paper contains 21 sections, 3 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Comparison of classification of frequent and conformant traces (green tiles), rare but conformant traces (blue tile) and infrequent anomalous traces (red tiles).
  • Figure 2: Overview of the proposed method. The event log is used both to pretrain an autoencoder for trace reconstruction and to mine Declare constraints. Domain knowledge guides the selection of constraints representing rare but conformant behavior, which are then injected into the autoencoder via LTN for fine-tuning. The enhanced model performs anomaly detection based on reconstruction error.
  • Figure 3: Example from the Paper event log showing the Response Declare constraint. If activity Develop Method occurs, then Final Decision must eventually follow.
  • Figure 4: Learning a binary classifier predicate $A_{\theta}$ using LTN. The neural network outputs are interpreted as truth values and aggregated using logical quantifiers to compute overall satisfiability, which guides parameter learning.
  • Figure 5: Fine-tuning the autoencoder $AE_{\theta}$ with LTN. Predicates $P_1$ to $P_n$ extract activity probabilities from reconstructed traces, which are used to evaluate Declare constraint satisfiability and guide learning.
  • ...and 2 more figures
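To make the Response constraint from Figure 3 concrete, the sketch below (illustrative code, not from the paper) shows a crisp check of Response(a, b) on a trace, alongside a fuzzy version evaluated on per-position activity probabilities such as the softmax outputs of a reconstructed trace. The fuzzy variant uses Goedel-style min/max quantifiers with the Kleene-Dienes implication as one possible real-valued semantics; the paper's LTN formulation may use different fuzzy operators and smooth aggregators.

```python
def response_hard(trace, a, b):
    """Crisp Declare Response(a, b): every occurrence of activity a
    must eventually be followed by activity b later in the trace."""
    return all(b in trace[i + 1:] for i, act in enumerate(trace) if act == a)


def response_soft(probs, a_idx, b_idx):
    """Fuzzy Response over per-position activity probability vectors,
    e.g. the softmax outputs of a reconstructed trace.
    Quantifiers: forall -> min, exists -> max (Goedel semantics);
    implication: Kleene-Dienes, max(1 - p, q)."""
    sat = 1.0
    for i, p in enumerate(probs):
        # "b eventually occurs after position i" as a fuzzy existential
        eventually_b = max((q[b_idx] for q in probs[i + 1:]), default=0.0)
        # "a at position i implies b eventually" as a fuzzy implication
        implies = max(1.0 - p[a_idx], eventually_b)
        sat = min(sat, implies)
    return sat
```

Under one-hot probabilities the fuzzy score reduces to the crisp check (1.0 for a conformant trace, 0.0 for a violating one); in LTN fine-tuning such a satisfiability score would enter the loss as a differentiable term, typically with min/max replaced by smooth aggregators.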