Timeline-based Process Discovery

Harleen Kaur, Jan Mendling, Christoffer Rubensson, Timotheus Kampik

TL;DR

This work addresses the lack of explicit time-axis representation in automatic process discovery by introducing a timeline-based DFG layout that aligns activities along a time axis. The approach computes relative times from event logs using the case start time $t_{min}(C)$, applies a relative transformation $f(C)$, and derives $g(a)$ to obtain the average occurrence time of each activity; these times then inform a time-axis mapping and the vertical alignment of activities. Implemented as a PM4Py fork with Graphviz output, the method is evaluated on BPIC 2012, BPIC 2017, and a proprietary sales dataset. Compared to standard layouts, it shows improved temporal ordering, phase-like segmentation, and holistic performance cues, while limitations remain in precise timing and causal interpretation. The results suggest that timeline-based process models can enhance analyst insights into waiting times, bottlenecks, and duration differences across paths, motivating further work on performance-focused visual analytics in process mining.
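
The relative-time computation summarized above can be sketched in a few lines of plain Python. This is a minimal illustration that follows the description in the Figure 1 caption, not the authors' PM4Py fork: the function name `relative_activity_times`, the tuple-based event format, and the use of seconds as the unit are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def relative_activity_times(events):
    """Average relative occurrence time (seconds) per activity.

    `events` is an iterable of (case_id, activity, timestamp) tuples.
    This input format is a hypothetical choice for the sketch.
    """
    # Group the events of each case together.
    cases = defaultdict(list)
    for case_id, activity, timestamp in events:
        cases[case_id].append((timestamp, activity))

    per_activity = defaultdict(list)  # activity -> one relative time per case
    for case_events in cases.values():
        case_events.sort(key=lambda e: e[0])
        t_min = case_events[0][0]  # earliest timestamp of the case

        # Collect the offsets of every occurrence, so that a repeated
        # activity (a loop) is averaged within the case first.
        offsets = defaultdict(list)
        for timestamp, activity in case_events:
            offsets[activity].append((timestamp - t_min).total_seconds())
        for activity, values in offsets.items():
            per_activity[activity].append(mean(values))

    # Average the per-case relative times across all cases.
    return {a: mean(v) for a, v in per_activity.items()}


log = [
    (10, "A", datetime(2024, 1, 1, 9, 0)),
    (10, "B", datetime(2024, 1, 1, 10, 0)),
    (10, "B", datetime(2024, 1, 1, 12, 0)),  # loop: B repeats in case 10
    (11, "A", datetime(2024, 1, 1, 9, 0)),
    (11, "B", datetime(2024, 1, 1, 10, 0)),
]
result = relative_activity_times(log)
```

In this toy log, activity B occurs twice in case 10, so its two offsets are averaged within the case before being averaged with case 11. Whether the paper weights repeated occurrences in exactly this way is not stated in this summary.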

Abstract

A key concern of automatic process discovery is to provide insights into performance aspects of business processes. Waiting times are of particular importance in this context. For that reason, it is surprising that current techniques for automatic process discovery generate directly-follows graphs and comparable process models, but often miss the opportunity to explicitly represent the time axis. In this paper, we present an approach for automatically constructing process models that explicitly align with a time axis. We exemplify our approach for directly-follows graphs. Our evaluation using two BPIC datasets and a proprietary dataset highlights the benefits of this representation in comparison to standard layout techniques.

Paper Structure (16 sections, 5 figures)

This paper contains 16 sections, 5 figures.

Figures (5)

  • Figure 1: Diagram showing how the relative time is calculated for each case in an event log. First, the event log is sorted by case ID and timestamp. For each case, the earliest activity is taken to be the start activity, which is activity $A$ in this diagram, highlighted in yellow. Case ID $10$ contains a loop because activity $B$ is repeated within the case. To deal with loops, each case is split into one sub-table per activity, and a mean value is calculated over each sub-table's timestamp column. The relative time of an activity is obtained by subtracting the case's initial timestamp from the mean of all timestamps of that activity.
  • Figure 2: A simple example of a visualization result (right) after generating the DOT script (left) from directly-follows and time relations of three activities.
  • Figure 3: BPI Challenge '12 event log: a simple DFG with standard layout and a DFG based on a timeline showing frequencies.
  • Figure 4: BPI Challenge '17 event log: a simple DFG and a DFG based on a timeline showing frequencies.
  • Figure 5: Proprietary sales process dataset: a simple DFG and a DFG based on a timeline showing frequencies.
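
To make the step described in the Figure 2 caption more concrete, the following sketch generates a DOT script from directly-follows and time relations. It is an assumption-laden illustration rather than the script produced by the authors' implementation: the function name `timeline_dot` and the input dictionaries are hypothetical, and the layout relies on a common Graphviz idiom (a chain of plaintext tick nodes plus `rank=same` groups) that may differ from the paper's approach.

```python
def timeline_dot(times, dfg):
    """Build a DOT script that pins activities to a vertical time axis.

    `times` maps activity -> relative time; `dfg` maps (source, target)
    -> frequency. Both input shapes are hypothetical for this sketch.
    """
    ticks = sorted(set(times.values()))
    lines = ["digraph timeline {", "  rankdir=TB;", "  node [shape=box];"]

    # Time axis: one plaintext node per distinct time value, chained
    # top to bottom so that Graphviz assigns increasing ranks.
    for i, t in enumerate(ticks):
        lines.append(f'  t{i} [shape=plaintext, label="{t}"];')
    for i in range(len(ticks) - 1):
        lines.append(f"  t{i} -> t{i + 1} [arrowhead=none];")

    # Place every activity on the same rank as its time tick.
    for i, t in enumerate(ticks):
        same = "; ".join(f'"{a}"' for a in sorted(times) if times[a] == t)
        lines.append(f"  {{ rank=same; t{i}; {same}; }}")

    # Directly-follows edges carry frequencies; constraint=false keeps
    # them from overriding the ranks fixed by the time axis.
    for (src, dst), freq in sorted(dfg.items()):
        lines.append(f'  "{src}" -> "{dst}" [label="{freq}", constraint=false];')

    lines.append("}")
    return "\n".join(lines)


dot = timeline_dot({"A": 0, "B": 60, "C": 60}, {("A", "B"): 3, ("A", "C"): 2})
```

The resulting string can be rendered with the Graphviz `dot` command. Note that ranks in this sketch preserve only the order of the time values, not the distances between them, so it does not draw the axis to scale.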