A TEE-based Approach for Preserving Data Secrecy in Process Mining with Decentralized Sources

Davide Basile, Valerio Goretti, Luca Barbaro, Hajo A. Reijers, Claudio Di Ciccio

TL;DR

CONFINE addresses the challenge of preserving data secrecy in inter-organizational process mining by leveraging confidential computing with Trusted Execution Environments (TEEs). It introduces a four-phase protocol and a segmentation-based data provisioning scheme that enable secure, in-TEE computation on unmodified multi-party event logs, with formal verification of correctness and security properties. The authors implement an SGX-based prototype integrating Heuristics Miner and Declare Conformance, and report memory usage that grows logarithmically with event log size and linearly with the number of participating organizations. Empirical results show convergence between CONFINE-derived results and stand-alone mining, and provide insights into memory usage, segmentation strategies, and scalability, highlighting practical viability and remaining optimization opportunities. Overall, CONFINE offers a practical framework for secrecy-preserving cross-organizational process analytics with formal guarantees and applicability across healthcare, supply chains, and manufacturing.

Abstract

Process mining techniques enable organizations to gain insights into their business processes through the analysis of execution records (event logs) stored by information systems. While most process mining efforts focus on intra-organizational scenarios, many real-world business processes span multiple independent organizations. Inter-organizational process mining, though, faces significant challenges, particularly regarding confidentiality guarantees: The analysis of data can reveal information that the participating organizations may not consent to disclose to one another, or to a third party hosting process mining services. To overcome this issue, this paper presents CONFINE, an approach for secrecy-preserving inter-organizational process mining. CONFINE leverages Trusted Execution Environments (TEEs) to deploy trusted applications that are capable of securely mining multi-party event logs while preserving data secrecy. We propose an architecture supporting a four-stage protocol to secure data exchange and processing, allowing for protected transfer and aggregation of unaltered process data across organizational boundaries. To avoid out-of-memory errors due to the limited capacity of TEEs, our protocol employs a segmentation-based strategy, whereby event logs are transmitted to TEEs in smaller batches. We conduct a formal verification of correctness and a security analysis of the guarantees provided by the TEE core. We evaluate our implementation on real-world and synthetic data, showing that the proposed approach can handle realistic workloads. The results indicate logarithmic memory growth with respect to the event log size and linear growth with the number of provisioning organizations, highlighting scalability properties and opportunities for further optimization.
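
The segmentation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the event tuple layout, the function names `segment_log` and `merge_segments`, the fixed-size batching rule, and the sample healthcare events are all assumptions made for illustration. The sketch only shows that a log split into bounded batches by each provisioning organization can be reassembled into per-case traces on the receiving side; it does not model attestation, encryption, or the TEE itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical event layout: (case_id, activity, timestamp).
Event = Tuple[str, str, int]


def segment_log(events: List[Event], segment_size: int) -> Iterator[List[Event]]:
    """Split one organization's log into batches of at most segment_size events."""
    if segment_size < 1:
        raise ValueError("segment_size must be positive")
    for start in range(0, len(events), segment_size):
        yield events[start:start + segment_size]


def merge_segments(segments: Iterable[List[Event]]) -> Dict[str, List[str]]:
    """Rebuild per-case traces from segments received in any order."""
    by_case: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for segment in segments:
        for case_id, activity, timestamp in segment:
            by_case[case_id].append((timestamp, activity))
    # Order each case's events by timestamp and keep only the activity names.
    return {
        case_id: [activity for _, activity in sorted(case_events)]
        for case_id, case_events in by_case.items()
    }


# Illustrative (made-up) logs from two provisioning organizations.
hospital_log = [("c1", "admit", 1), ("c1", "discharge", 4), ("c2", "admit", 2)]
pharmacy_log = [("c1", "dispense", 3), ("c2", "dispense", 5)]

segments = list(segment_log(hospital_log, 2)) + list(segment_log(pharmacy_log, 2))
traces = merge_segments(segments)
print(traces)
```

Bounding `segment_size` caps how many events must be held in a single transfer, which is the property the abstract relies on to avoid out-of-memory errors in a capacity-limited TEE.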

Paper Structure (33 sections, 4 theorems, 12 figures, 7 tables)

This paper contains 33 sections, 4 theorems, 12 figures, 7 tables.

Key Result

Proposition 4.1

Let $\mathrm{L}'$, $\mathrm{L}''$, $\mathrm{L}'''$ be three event logs. If $\bigoplus$ is a safe merge, it enjoys the properties of

Figures (12)

  • Figure 1: A BPMN collaboration diagram of a simplified healthcare scenario
  • Figure 2: The CONFINE high-level architecture
  • Figure 3: The Secure Miner
  • Figure 4: UML deployment diagram of the CONFINE architecture
  • Figure 5: Phases of the CONFINE protocol with references to the related sections, figures, and algorithms in this article
  • ...and 7 more figures

Theorems & Definitions (16)

  • definition 1: Event
  • definition 2: Event log
  • definition 3: Process mining function
  • definition 4: Provisioner
  • definition 5: Log partition
  • definition 6: Segment
  • definition 7: Case
  • definition 8: Merge
  • Proposition 4.1
  • proof
  • ...and 6 more