A Scalable and Near-Optimal Conformance Checking Approach for Long Traces

Eli Bogdanov; Izack Cohen; Avigdor Gal

TL;DR

This work tackles conformance checking for long traces by partitioning each trace into subtraces of length $L$ and solving alignments within a sliding window, so the search is carried out over $W=\lceil N/L\rceil$ windows rather than over the whole trace at once. It introduces a pruning mechanism driven by global information, based on a marginal-cost lower bound, and maintains the model state across subtraces to preserve coherence, enabling near-optimal alignments at scale. The approach is formalized with trace and subtrace models and a cost framework, analyzed for complexity, and validated on classic datasets and on long-trace food-preparation datasets, achieving optimal alignments in over $96\%$ of traces with an average deviation of $0.66\%$. The resulting method offers scalable, interpretable conformance checking for large process logs generated by sensors and prediction models.

Abstract

Long traces and large event logs that originate from sensors and prediction models are becoming more common in our data-rich world. In such circumstances, conformance checking, a key task in process mining, can become computationally infeasible due to the exponential complexity of finding an optimal alignment. This paper introduces a novel sliding window approach to address these scalability challenges while preserving the interpretability of alignment-based methods. By breaking down traces into manageable subtraces and iteratively aligning each with the process model, our method significantly reduces the search space. The approach uses global information that captures structural properties of the trace and the process model to make informed alignment decisions, discarding unpromising alignments even if they are optimal for a local subtrace. This improves the overall accuracy of the results. Experimental evaluations demonstrate that the proposed method consistently finds optimal alignments in most cases and highlight its scalability. This is further supported by a theoretical complexity analysis, which shows the reduced growth of the search space compared to other common conformance checking methods. This work provides a valuable contribution towards efficient conformance checking for large-scale process mining applications.
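The partitioning step described above can be illustrated with a minimal sketch. It assumes that the subtraces are consecutive and non-overlapping, that the last subtrace may be shorter than the others, and that a trace is a plain list of activity labels; the function name `split_into_subtraces` and the toy trace are hypothetical and not taken from the paper. The sketch covers only the split into windows, not the alignment or the pruning.

```python
from math import ceil
from typing import List, Sequence


def split_into_subtraces(trace: Sequence[str], window_len: int) -> List[List[str]]:
    """Split a trace into consecutive subtraces of at most window_len events.

    Hypothetical helper: assumes non-overlapping windows, with a shorter
    final window when len(trace) is not a multiple of window_len.
    """
    if window_len <= 0:
        raise ValueError("window_len must be positive")
    return [list(trace[i:i + window_len]) for i in range(0, len(trace), window_len)]


# Toy trace with N = 7 events and window length L = 3 (illustrative values).
trace = ["a", "b", "c", "d", "e", "f", "g"]
subtraces = split_into_subtraces(trace, 3)

# The number of windows matches W = ceil(N / L) = ceil(7 / 3) = 3.
assert len(subtraces) == ceil(len(trace) / 3)
print(subtraces)  # [['a', 'b', 'c'], ['d', 'e', 'f'], ['g']]
```

In a full implementation, each subtrace would be aligned against the process model starting from the model state reached at the end of the previous window, which this sketch leaves out.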

Paper Structure (13 sections, 1 equation, 3 figures, 2 tables, 1 algorithm)


Introduction
Modeling and Definitions
Algorithmic Design and Implementation
The Algorithmic Approach
Embedding Local and Global Information
Sliding Window Mechanism and Iterative Alignment
Illustrative Example of Algorithm Execution
Complexity Analysis
Empirical Evaluation
Classic Datasets
Long Traces of Food Preparation Datasets
Related Work
Conclusion and Future Directions

Figures (3)

Figure 1: An example of a process model.
Figure 2: An example of a trace model.
Figure 3: Markings and alignment costs for subtraces. Subfigures 1-2 show the alignments for the first subtrace. Subfigures 3 and 5 present alignments starting from $[p_{2},p'_3]$. Subfigures 4 and 6 present alignments starting from $[p_{0},p'_3]$. Subfigures 7-8 present the lowest-cost alignments for the last subtrace starting from $[p_{3},p'_6]$ and $[p_{2},p'_6]$, respectively.

Theorems & Definitions (2)

Definition 1: Subtrace Model
Definition 2: Partial Optimal Alignment
