A Scalable and Near-Optimal Conformance Checking Approach for Long Traces

Eli Bogdanov, Izack Cohen, Avigdor Gal

TL;DR

This work tackles the challenge of conformance checking for long traces by partitioning traces into subtraces of length $L$ and solving alignments within a sliding window, reducing the search space to $W=\lceil N/L\rceil$ windows. It introduces a global-information-driven pruning mechanism via a marginal-cost lower bound and maintains model state across subtraces to preserve coherence, enabling near-optimal alignments at scale. The approach is formalized with trace/subtrace models and a cost framework, analyzed for complexity, and validated on classic and long-trace food-preparation datasets, achieving optimal alignments in over $96\%$ of traces with a small average deviation of $0.66\%$. The resulting method offers scalable, interpretable conformance checking for large-scale sensor and prediction-model-generated process logs, with practical impact on real-world process mining tasks.
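The partitioning step can be illustrated with a short sketch. It assumes consecutive, non-overlapping subtraces where only the last one may be shorter than $L$; the function name `split_into_subtraces` and the example trace are hypothetical and not taken from the paper.

```python
from math import ceil

def split_into_subtraces(trace, window_len):
    """Split a trace into consecutive subtraces of at most window_len events."""
    if window_len <= 0:
        raise ValueError("window_len must be positive")
    return [trace[i:i + window_len] for i in range(0, len(trace), window_len)]

# Hypothetical trace with N = 10 events and subtrace length L = 4.
trace = list("abcdefghij")
L = 4
subtraces = split_into_subtraces(trace, L)

# The number of windows matches W = ceil(N / L).
W = ceil(len(trace) / L)
```

Under these assumptions the ten-event trace yields three subtraces, the last holding the two leftover events.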

Abstract

Long traces and large event logs that originate from sensors and prediction models are becoming more common in our data-rich world. In such circumstances, conformance checking, a key task in process mining, can become computationally infeasible due to the exponential complexity of finding an optimal alignment. This paper introduces a novel sliding window approach to address these scalability challenges while preserving the interpretability of alignment-based methods. By breaking down traces into manageable subtraces and iteratively aligning each with the process model, our method significantly reduces the search space. The approach uses global information that captures structural properties of the trace and the process model to make informed alignment decisions, discarding unpromising alignments even if they are optimal for a local subtrace. This improves the overall accuracy of the results. Experimental evaluations demonstrate that the proposed method consistently finds optimal alignments in most cases and highlight its scalability. This is further supported by a theoretical complexity analysis, which shows the reduced growth of the search space compared to other common conformance checking methods. This work provides a valuable contribution towards efficient conformance checking for large-scale process mining applications.
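The idea of aligning subtraces one at a time while carrying the model state forward can be sketched on a toy problem. Everything below is an illustrative assumption rather than the paper's algorithm: the process model is a plain sequence of activities, log moves and model moves cost 1, synchronous moves cost 0, and pruning is a simple beam ranked by cost so far plus a length-difference lower bound that stands in for the paper's marginal-cost bound. The names `align_from` and `windowed_alignment_cost` are hypothetical.

```python
from math import inf

def align_from(subtrace, model, start):
    """Cheapest cost of aligning subtrace while the model advances from
    position start to each reachable position q (unit-cost moves)."""
    m = len(model)
    # Before any event is consumed, reaching q takes q - start model moves.
    row = {q: q - start for q in range(start, m + 1)}
    for event in subtrace:
        new = {start: row[start] + 1}  # log move while staying at start
        for q in range(start + 1, m + 1):
            sync = row[q - 1] if model[q - 1] == event else inf
            # log move, model move, or synchronous move
            new[q] = min(row[q] + 1, new[q - 1] + 1, sync)
        row = new
    return row

def windowed_alignment_cost(trace, model, window_len, beam_width=None):
    """Align the trace window by window, carrying candidate model positions."""
    m = len(model)
    candidates = {0: 0}  # model position -> cheapest cost so far
    for i in range(0, len(trace), window_len):
        subtrace = trace[i:i + window_len]
        remaining = len(trace) - (i + len(subtrace))
        reached = {}
        for start, cost in candidates.items():
            for q, step in align_from(subtrace, model, start).items():
                reached[q] = min(reached.get(q, inf), cost + step)
        # Rank by cost so far plus a lower bound on the remaining cost:
        # every unmatched remaining event or model step costs at least 1.
        ranked = sorted(reached,
                        key=lambda q: reached[q] + abs((m - q) - remaining))
        kept = ranked if beam_width is None else ranked[:beam_width]
        candidates = {q: reached[q] for q in kept}
    # Finish with model moves up to the end of the model.
    return min(cost + (m - q) for q, cost in candidates.items())

model = list("abcdefgh")   # hypothetical sequential model
trace = list("abxdefh")    # hypothetical trace with deviations
exact = windowed_alignment_cost(trace, model, window_len=3)
pruned = windowed_alignment_cost(trace, model, window_len=3, beam_width=2)
```

With no beam limit the windowed search keeps every model position and therefore matches the cost of aligning the whole trace at once. With a beam, a candidate that is locally cheapest for one subtrace can still be discarded if its bound is poor, which mirrors the trade-off the abstract describes; the pruned result is never below the exact cost.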

Paper Structure (13 sections, 1 equation, 3 figures, 2 tables, 1 algorithm)

Figures (3)

  • Figure 1: An example of a process model.
  • Figure 2: An example of a trace model.
  • Figure 3: Markings and alignment costs for subtraces. Panels 1-2 show the alignments for the first subtrace. Panels 3 and 5 present alignments starting from $[p_{2},p'_3]$. Panels 4 and 6 present alignments starting from $[p_{0},p'_3]$. Panels 7-8 present the lowest-cost alignments for the last subtrace starting from $[p_{3},p'_6]$ and $[p_{2},p'_6]$, respectively.

Theorems & Definitions (2)

  • Definition 1: Subtrace Model
  • Definition 2: Partial Optimal Alignment