xSemAD: Explainable Semantic Anomaly Detection in Event Logs Using Sequence-to-Sequence Models

Kiran Busch, Timotheus Kampik, Henrik Leopold

TL;DR

xSemAD tackles the explainability gap in semantic anomaly detection for event logs by learning declarative execution constraints from a large repository of process models. It fine-tunes a sequence-to-sequence model (Flan-T5) on input–target pairs derived from Declare constraints, yielding an events2constraints model. Given the events of a log, this model generates the constraints that are likely to apply, using beam search and keeping candidates according to a threshold θ. The generated constraints are then checked against the observed log, so that each semantic violation is reported together with the constraint it breaks, which enables targeted corrective actions. An empirical evaluation on a large dataset based on SAP-SAM shows that xSemAD outperforms state-of-the-art semantic anomaly detectors (SVM, BERT) on eventually-follows (EvF) relations and declarative discovery methods (MINERful, Declare Miner) on a broad set of constraint types, while generalizing to unseen event labels and keeping inference times feasible. The work highlights the practical value of explainable, constraint-centric anomaly detection for process improvement and lays the groundwork for future integration with temporal semantics and hybrid detection approaches.

Abstract

The identification of undesirable behavior in event logs is an important aspect of process mining that is often addressed by anomaly detection methods. Traditional anomaly detection methods tend to focus on statistically rare behavior and neglect the subtle difference between rarity and undesirability. The introduction of semantic anomaly detection has opened a promising avenue by identifying semantically deviant behavior. This work addresses a gap in semantic anomaly detection, which typically indicates that an anomaly has occurred without explaining its nature. We propose xSemAD, an approach that uses a sequence-to-sequence model to go beyond pure identification and provide extended explanations. In essence, our approach learns constraints from a given process model repository and then checks whether these constraints hold in the considered event log. This approach not only helps to understand the specifics of the undesired behavior, but also facilitates targeted corrective actions. Our experiments demonstrate that our approach outperforms existing state-of-the-art semantic anomaly detection methods.
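
To make the "learn constraints, then check them against the log" idea concrete, the following is a minimal sketch of the checking step only. It is not the authors' implementation: the candidate list stands in for the output of the fine-tuned model, the scores, the activity labels, and the rule "keep a candidate if its score is at least θ" are assumptions made for illustration, and only two standard Declare templates (Response and Precedence) are covered.

```python
from typing import Callable, Dict, List, Tuple

Trace = List[str]
# (template, activity a, activity b, score); scores are hypothetical model outputs
Candidate = Tuple[str, str, str, float]


def response(trace: Trace, a: str, b: str) -> bool:
    """Response(a, b): every occurrence of a is eventually followed by b."""
    pending = False
    for event in trace:
        if event == a:
            pending = True
        elif event == b:
            pending = False
    return not pending


def precedence(trace: Trace, a: str, b: str) -> bool:
    """Precedence(a, b): b may occur only if a occurred earlier in the trace."""
    seen_a = False
    for event in trace:
        if event == a:
            seen_a = True
        elif event == b and not seen_a:
            return False
    return True


CHECKERS: Dict[str, Callable[[Trace, str, str], bool]] = {
    "Response": response,
    "Precedence": precedence,
}


def explain_violations(
    trace: Trace, candidates: List[Candidate], theta: float
) -> List[str]:
    """Return the candidate constraints (score >= theta) that the trace violates."""
    violations = []
    for template, a, b, score in candidates:
        if score < theta:
            continue  # assumed filtering rule: drop low-scoring candidates
        if not CHECKERS[template](trace, a, b):
            violations.append(f"{template}({a}, {b})")
    return violations


# Hypothetical candidates for a loan application process.
candidates: List[Candidate] = [
    ("Precedence", "check application", "approve loan", 0.92),
    ("Response", "receive application", "notify applicant", 0.81),
    ("Response", "approve loan", "archive case", 0.30),
]
trace = ["receive application", "approve loan", "notify applicant"]
violations = explain_violations(trace, candidates, theta=0.5)
print(violations)
```

In this sketch the trace approves a loan without a preceding check, so the reported violation names the broken Precedence constraint rather than merely flagging the trace as anomalous. The third candidate is also violated by the trace but is not reported, because its assumed score falls below θ.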

Paper Structure (12 sections, 3 figures, 4 tables)

This paper contains 12 sections, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Anomalous traces in a loan application process.
  • Figure 2: Overview of our xSemAD approach.
  • Figure 3: Precision, recall, and $F_{1}$ scores for each constraint type and different $\theta$ values.