Conformance Checking for Less: Efficient Conformance Checking for Long Event Sequences

Eli Bogdanov, Izack Cohen, Avigdor Gal

TL;DR

ConLES addresses the otherwise intractable conformance checking of long event sequences with a sliding-window approach that confines the exponential search to small subtraces while preserving the process model's state across windows. By leveraging both local and global information, ConLES computes partial and final alignments with a bounded search depth and maintains near-optimal accuracy. The method is supported by a formal complexity analysis and an extensive empirical evaluation showing substantial speedups over optimal methods, as well as robustness on poorly fitting models and very long traces. The result is a scalable, interpretable conformance checking solution suitable for real-world logs with thousands of events and for both predefined and discovered process models.

Abstract

Long event sequences (termed traces) and large data logs that originate from sensors and prediction models are becoming increasingly common in our data-rich world. In such scenarios, conformance checking, that is, validating a data log against an expected system behavior (the process model), can become computationally infeasible due to the exponential complexity of finding an optimal alignment. To alleviate scalability challenges for this task, we propose ConLES, a sliding-window conformance checking approach for long event sequences that preserves the interpretability of alignment-based methods. ConLES partitions traces into manageable subtraces and iteratively aligns each against the expected behavior, which significantly reduces the search space while maintaining overall accuracy. We use global information that captures structural properties of both the trace and the process model, enabling informed alignment decisions and discarding unpromising alignments, even if they appear locally optimal. Performance evaluations across multiple datasets show that ConLES outperforms the leading optimal and heuristic algorithms for long traces, consistently achieving the optimal or a near-optimal solution. Unlike other conformance methods that struggle with long event sequences, ConLES significantly reduces the search space, scales efficiently, and uniquely supports both predefined and discovered process models, making it a viable and leading option for conformance checking of long event sequences.
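
The partitioning step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes consecutive, non-overlapping windows of a fixed length, and the names `split_into_subtraces` and `window_length` are hypothetical. It covers only how a trace is cut into subtraces; the alignment of each subtrace and the carrying of the model state between windows are omitted.

```python
from typing import List, Sequence


def split_into_subtraces(trace: Sequence[str], window_length: int) -> List[List[str]]:
    """Cut a trace into consecutive windows of at most `window_length` events.

    Assumption for illustration: windows do not overlap, and the last
    window may be shorter than the others.
    """
    if window_length <= 0:
        raise ValueError("window_length must be positive")
    return [
        list(trace[start:start + window_length])
        for start in range(0, len(trace), window_length)
    ]


# A toy trace of nine events, cut into windows of three events each.
trace = ["a", "b", "c", "d", "e", "f", "g", "h", "i"]
subtraces = split_into_subtraces(trace, 3)

# A window as long as the trace itself yields a single subtrace,
# i.e. the whole trace is processed at once.
whole = split_into_subtraces(trace, len(trace))
```

Under these assumptions, a larger window means fewer subtraces, each of which requires a larger search, which is the trade-off between window length and computation time that the paper evaluates.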

Paper Structure

This paper contains 20 sections, 8 equations, 4 figures, 3 tables, and 1 algorithm.

Figures (4)

  • Figure 1: An example of a process model, showing the sequence of activities and their relationships.
  • Figure 2: An example of a trace model, illustrating the execution path of a specific process instance.
  • Figure 3: Markings and alignment costs for subtraces. (a) Alignment for first subtrace ending at $[p_{2}, p'_3]$. (b) Alternative alignment for first subtrace ending at $[p_{0}, p'_3]$. (c) Alignment from $[p_{2}, p'_3]$ to $[p_{2}, p'_6]$. (d) Suboptimal alignment from $[p_{0}, p'_3]$ to $[p_{2}, p'_6]$. (e) Alignment from $[p_{2}, p'_3]$ to $[p_{3}, p'_6]$. (f) Suboptimal alignment from $[p_{0}, p'_3]$ to $[p_{3}, p'_6]$. (g) Alignment from $[p_{3}, p'_6]$ to $[p_{4}, p'_9]$. (h) Optimal alignment from $[p_{2}, p'_6]$ to $[p_{4}, p'_9]$.
  • Figure 4: ConLES's alignment time and cost deviation versus window length $L$. As $L$ increases, fewer subtraces are required, but computation time per subtrace grows. Larger $L$ ensures cost convergence to the optimal, with $L=3500$ representing full trace processing.

Theorems & Definitions (7)

  • Definition 1: Labeled Petri Net
  • Definition 2: Trace Model
  • Definition 3: Synchronous Product
  • Definition 4: Cost Function
  • Definition 5: Optimal Alignment
  • Definition 6: Subtrace Model
  • Definition 7: Partial Optimal Alignment