Developing a High-Performance Process Mining Library with Java and Python Bindings in Rust

Aaron Küsters, Wil M. P. van der Aalst

TL;DR

The paper tackles fragmentation and performance gaps between ProM (Java) and PM4Py (Python) by proposing a shared, high-performance library implemented in Rust with bindings to Java and Python. The core algorithms are implemented once in Rust and exposed through thin JNI and PyO3 bindings, with optional WebAssembly targets to broaden deployment. Empirical results show substantial speedups, including up to ~70x for Alpha+++ and up to ~5x for XES importing, illustrating the practicality of cross-language, high-performance process mining libraries. A starter kit is provided to bootstrap such cross-language bindings, demonstrating both feasibility and potential community impact for rapid algorithm development and dissemination.

Abstract

The most commonly used open-source process mining software tools today are ProM and PM4Py, written in Java and Python, respectively. Such high-level, often interpreted, programming languages trade performance for memory safety and ease of use. In contrast, traditional compiled languages, like C or C++, can achieve top performance but often suffer from instability related to unsafe memory management. Recently, Rust has emerged as a highly performant, compiled programming language with inherent memory safety. In this paper, we describe our approach to developing a shared process mining library in Rust with bindings to both Java and Python, allowing full integration into existing ecosystems such as ProM and PM4Py. By facilitating interoperability, our methodology enables researchers and industry to develop novel algorithms in Rust once and make them accessible to the entire community while also achieving superior performance.
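
The "implement once, bind thinly" idea described above can be illustrated with a minimal sketch. This is not the paper's code: the function names `directly_follows` and `pm_count_df_pairs`, the `;`/`,` text encoding of the log, and the use of a plain C-ABI entry point (instead of JNI or PyO3 bindings) are all assumptions chosen to keep the example free of external dependencies.

```rust
use std::collections::HashMap;
use std::ffi::{c_char, CStr};

/// Core algorithm, written once in Rust (illustrative only): count how often
/// activity `a` is directly followed by activity `b` across all traces.
pub fn directly_follows(traces: &[Vec<&str>]) -> HashMap<(String, String), u64> {
    let mut counts = HashMap::new();
    for trace in traces {
        // `windows(2)` yields every pair of neighbouring activities in a trace.
        for pair in trace.windows(2) {
            *counts
                .entry((pair[0].to_string(), pair[1].to_string()))
                .or_insert(0) += 1;
        }
    }
    counts
}

/// Hypothetical thin binding layer: a C-ABI entry point that only decodes the
/// input and delegates to the core function. A real shared library would also
/// mark this function `no_mangle` so that its symbol name is stable.
///
/// Input: NUL-terminated UTF-8 text, traces separated by ';' and activities
/// by ',' (an encoding assumed for this sketch).
/// Returns the number of distinct directly-follows pairs, or -1 on bad input.
///
/// # Safety
/// `log` must be null or point to a valid NUL-terminated string.
pub unsafe extern "C" fn pm_count_df_pairs(log: *const c_char) -> i64 {
    if log.is_null() {
        return -1;
    }
    let text = match unsafe { CStr::from_ptr(log) }.to_str() {
        Ok(t) => t,
        Err(_) => return -1, // input was not valid UTF-8
    };
    let traces: Vec<Vec<&str>> = text
        .split(';')
        .map(|t| t.split(',').filter(|a| !a.is_empty()).collect())
        .collect();
    directly_follows(&traces).len() as i64
}
```

The wrapper contains no mining logic of its own, so a second binding for another host language would reuse `directly_follows` unchanged. In the setting the paper describes, the JNI and PyO3 layers play this wrapper role.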

Paper Structure (17 sections, 6 figures, 7 tables)

This paper contains 17 sections, 6 figures, 7 tables.

Figures (6)

  • Figure 1: The two main advantages of our proposed approach. The main algorithm (like Alpha+++ or XES parsing in this paper) is implemented only once in Rust. Java and Python bindings make this implementation available to established tool ecosystems (like ProM and PM4Py). Additionally, our evaluation indicates great potential speedups of Rust implementations, compared to baseline implementations in Java or Python.
  • Figure 2: Overview of the approach: A main implementation is written once in Rust. Thin wrappers for Java, Python, or other languages bind to the main implementation and expose functionality for easy use. Other programs (like a Java GUI program, a Python script) can make use of the exposed functionality.
  • Figure 3: Screenshot of the AlphaRevisitExperiments ProM plugin featuring a Mine Petri Net (in Rust) button for executing the Alpha+++ discovery in Rust instead of Java.
  • Figure 4: Performance comparison of XES import implementations. For all evaluated event logs, the rustxes implementation performed better than the two other implementations.
  • Figure 5: Performance comparison of the Alpha+++ discovery algorithm between the initial Python and Java implementation and the efficient Rust re-implementation across different event logs.
  • ...and 1 more figure