Developing a High-Performance Process Mining Library with Java and Python Bindings in Rust
Aaron Küsters, Wil M. P. van der Aalst
TL;DR
The paper addresses the fragmentation and performance gaps between ProM (Java) and PM4Py (Python) by proposing a shared, high-performance library implemented in Rust with bindings to both Java and Python. The core algorithms are implemented once in Rust and exposed through thin JNI and PyO3 bindings, with optional WebAssembly targets to broaden deployment. Empirical results show substantial speedups, up to roughly 70x for Alpha+++ and roughly 5x for XES importing, which indicates that cross-language, high-performance process mining libraries are practical. A starter kit is provided to bootstrap such cross-language bindings, demonstrating both the feasibility of the approach and its potential to speed up algorithm development and dissemination in the community.
Abstract
The most commonly used open-source process mining software tools today are ProM and PM4Py, written in Java and Python, respectively. Such high-level, often interpreted, programming languages trade performance for memory safety and ease of use. In contrast, traditional compiled languages, like C or C++, can achieve top performance but often suffer from instability related to unsafe memory management. Recently, Rust has emerged as a highly performant, compiled programming language with inherent memory safety. In this paper, we describe our approach to developing a shared process mining library in Rust with bindings to both Java and Python, allowing full integration into existing ecosystems such as ProM and PM4Py. By facilitating interoperability, our methodology enables researchers and industry practitioners to develop novel algorithms in Rust once and make them accessible to the entire community while also achieving superior performance.
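The "implement once, bind thinly" idea described above can be illustrated with a minimal sketch. It is not the library's actual API: the function names `directly_follows` and `pm_distinct_df_pairs`, the integer encoding of activities, and the use of a plain C ABI wrapper are all assumptions made for illustration. Real Python or Java bindings would typically be written with the PyO3 or JNI tooling mentioned in the paper instead of this hand-rolled wrapper.

```rust
use std::collections::HashMap;
use std::slice;

/// Core algorithm, written once in plain Rust (hypothetical example):
/// count how often activity `a` is directly followed by activity `b`.
/// Activities are integer-encoded; each inner slice is one trace.
pub fn directly_follows(traces: &[&[u32]]) -> HashMap<(u32, u32), usize> {
    let mut counts = HashMap::new();
    for trace in traces {
        // `windows(2)` yields every pair of consecutive events in the trace.
        for pair in trace.windows(2) {
            *counts.entry((pair[0], pair[1])).or_insert(0) += 1;
        }
    }
    counts
}

/// Hypothetical thin binding layer: it only converts foreign data into
/// Rust types and delegates to the core function above. It returns the
/// number of distinct directly-follows pairs. A real export would also
/// need a no-mangle attribute so the symbol keeps its name.
///
/// # Safety
/// `trace_lens` must point to `num_traces` values, and `activities` must
/// point to as many `u32` values as the sum of those lengths.
pub unsafe extern "C" fn pm_distinct_df_pairs(
    activities: *const u32,
    trace_lens: *const usize,
    num_traces: usize,
) -> usize {
    if activities.is_null() || trace_lens.is_null() {
        return 0;
    }
    // Reinterpret the raw pointers as slices (caller guarantees validity).
    let lens = unsafe { slice::from_raw_parts(trace_lens, num_traces) };
    let total: usize = lens.iter().sum();
    let flat = unsafe { slice::from_raw_parts(activities, total) };

    // Split the flat activity buffer back into one slice per trace.
    let mut traces: Vec<&[u32]> = Vec::with_capacity(num_traces);
    let mut start = 0;
    for &len in lens {
        traces.push(&flat[start..start + len]);
        start += len;
    }
    directly_follows(&traces).len()
}
```

The point of the split is that the wrapper contains no algorithmic logic, so a second wrapper for another host language can reuse `directly_follows` unchanged.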
