DPM-Bench: Benchmark for Distributed Process Mining Algorithms on Cyber-Physical Systems

Hendrik Reiter, Patrick Rathje, Olaf Landsiedel, Wilhelm Hasselbring

TL;DR

The paper addresses the limitations of centralized process mining in CPS by proposing Distributed Process Mining (DPM) and a formal CPS-adapted streaming model. It introduces DPM-Bench, a benchmark framework that uses Hardware Interaction Instructions (HIIs) to quantify costs and supports topology-aware evaluation across central, decentralized, and distributed PM configurations, demonstrated on a three-node setup with generated distributed event data. Key contributions include a formal taxonomy of PM topologies, an extended streaming model for CPS, and a public benchmark tool to compare DPM algorithms and topologies, providing insights into algorithmic behavior and infrastructure requirements. The framework enables engineers to assess hardware and network provisioning for DPM deployments and guides future research toward privacy-preserving, scalable distributed process mining in CPS.

Abstract

Process Mining is established in research and industry systems to analyze and optimize processes based on event data from information systems. Within this work, we accommodate process mining techniques to Cyber-Physical Systems (CPS). To capture the distributed and heterogeneous characteristics of data, computational resources, and network communication in CPS, today's process mining algorithms and techniques must be augmented. Specifically, there is a need for new Distributed Process Mining algorithms that enable computations to be performed directly on edge resources, eliminating the need to move all data to central cloud systems. This paper introduces the DPM-Bench benchmark for comparing such Distributed Process Mining algorithms. DPM-Bench is used to compare algorithms deployed in different computational topologies. The results enable information system engineers to assess whether the existing infrastructure is sufficient to perform distributed process mining, or to identify required improvements in algorithms and hardware. We present and discuss an experimental evaluation with DPM-Bench.

Paper Structure

This paper contains 17 sections, 2 equations, 5 figures, 1 algorithm.

Figures (5)

  • Figure 1: Exemplary Distributed Event Log for Processes in a Smart Factory: Events are distributed across different departments and are hence localized.
  • Figure 2: Data flow and computing capabilities in the Edge-Cloud-Continuum
  • Figure 3: The three process mining topologies differ in computing and communication allocation, with Distributed PM leveraging data sources' inherent resources.
  • Figure 4: Quality measures derived from DPM-Bench by processed events. It captures significant changes between algorithms and topologies.
  • Figure 5: Resource Demand and Load Capacity in relation to the provisioned network capacity

Theorems & Definitions (7)

  • Definition: Event and Event Stream
  • Definition: Distributed Event Stream
  • Definition: Hardware Interaction Instruction
  • Definition: Computing Node
  • Definition: Processing Time
  • Definition: Resource Utilization
  • Definition: Scalability