Object-Centric Event Logs: Specifications, Comparative Analysis and Refinement
Alexandre Goossens, Johannes De Smedt, Jan Vanthienen
TL;DR
The paper tackles the lack of standardization in object-centric event logs and the interoperability challenges across formats. It develops a four-dimensional specifications framework, drawn from process mining, object-centric modeling, and database storage, to compare formats along $E2E$, $O2O$, $E2O$, and Data Quality. By analyzing OCEL 1.0, OCEL 2.0, XOC, DOCEL, ACEL, EKG, and OCED, it reveals trade-offs in attribute changes, object relations, and event-object relations, showing that no format currently supports all three of these aspects fully. It then proposes practical refinements for OCEL 2.0 (traceable dynamic attributes via foreign keys, reified object relations, and dynamic relations) to enhance scalability, traceability, and interoperability, with the aim of moving toward an Object-Centric XES standard.
Abstract
Process mining aims to comprehend and enhance business processes by analyzing event logs. Recently, object-centric process mining has gained traction by considering multiple objects interacting with each other in a process. This object-centric approach offers advantages over traditional methods by avoiding dimension reduction issues. However, in contrast to traditional process mining, where a standard event log format was quickly agreed upon, with XES providing a common platform for further research and industry, various object-centric logging formats have been proposed, each addressing specific challenges such as object relations or dynamic attribute changes. As a result, interoperability of object-centric algorithms remains a challenge, hindering reproducibility and generalizability in research. Additionally, the object-centric process storage paradigm aligns well with a wide range of object-oriented databases storing process data. This paper introduces a specifications framework built from three perspectives, namely process mining (what should be analyzed), object-centric process modeling (how it should be modeled), and database storage (how it should be stored), in order to compare and evaluate object-centric log formats. By identifying commonalities and discrepancies among these formats, the study delves into unresolved issues and proposes potential solutions. Ultimately, this research contributes to advancing object-centric process mining by facilitating a deeper understanding of event log formats and promoting consistency and compatibility across methodologies.
