Directly Follows Graphs Go Predictive Process Monitoring With Graph Neural Networks
Attila Lischka, Simon Rauch, Oliver Stritzel
TL;DR
The paper tackles predictive process monitoring by moving from traditional sequence-based representations to directly-follows-graph (DFG) representations processed with Graph Neural Networks (GNNs). It systematically compares six DFG representations across three GNN families (GCN, GAT, GREAT) and investigates single- and multi-graph variants to predict next activities and remaining times on multiple real-world datasets. Key contributions include a comprehensive design of DFG encodings tailored to different GNNs, an edge-based and multi-graph exploration to minimize information loss, and an empirical evaluation showing competitive performance against sequence-based baselines with potential advantages for long, looping processes. The work signals a shift toward graph-centric PPM and lays groundwork for future enhancements such as virtual nodes and synthetic log generation to further contrast graph-based and sequence-based approaches.
Abstract
In recent years, predictive process monitoring (PPM) techniques based on artificial neural networks have evolved as a method to monitor the future behavior of business processes. Existing approaches mostly focus on interpreting the processes as sequences, so-called traces, and feeding them to neural architectures designed to operate on sequential data, such as recurrent neural networks (RNNs) or transformers. In this study, we investigate an alternative way to perform PPM: by transforming each process into its directly-follows-graph (DFG) representation, we are able to apply graph neural networks (GNNs) to the prediction tasks. In doing so, we aim to develop models that are more suitable for complex processes that are long and contain an abundance of loops. In particular, we present different ways to create DFG representations depending on the particular GNN we use. The tested GNNs range from classical node-based to novel edge-based architectures. Further, we investigate the possibility of using multi-graphs. With these steps, we aim to design graph representations that minimize the information loss when transforming traces into graphs.
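To make the trace-to-graph transformation concrete, the following is a minimal sketch of how a single trace might be turned into a DFG and into a multi-graph. It is an illustration only, not the encoding used in the paper: the function names `trace_to_dfg` and `trace_to_multigraph`, the example activity names, and the choice of edge counts and edge positions as edge attributes are all assumptions made here.

```python
from collections import Counter


def trace_to_dfg(trace):
    """Collapse a trace into a simple DFG (hypothetical helper).

    Nodes are the distinct activities; each distinct directly-follows
    pair becomes one edge whose weight counts how often it occurred.
    """
    nodes = sorted(set(trace))
    edge_counts = Counter(zip(trace, trace[1:]))
    return nodes, dict(edge_counts)


def trace_to_multigraph(trace):
    """Keep every transition as its own edge (hypothetical helper).

    Each edge is tagged with its position in the trace, so parallel
    edges between the same two activities stay distinguishable.
    """
    nodes = sorted(set(trace))
    edges = [
        (src, dst, pos)
        for pos, (src, dst) in enumerate(zip(trace, trace[1:]))
    ]
    return nodes, edges


# Assumed example trace with a loop between "check" and "repair".
trace = ["register", "check", "repair", "check", "repair", "check", "close"]

nodes, edges = trace_to_dfg(trace)
_, multi_edges = trace_to_multigraph(trace)

print(nodes)
print(edges)
print(multi_edges)
```

In this sketch the looping trace of seven events shrinks to four nodes and four weighted edges, which illustrates why a graph view can stay compact for long processes with many loops. The collapsed graph no longer records the order in which the transitions occurred, whereas the multi-graph variant keeps one edge per transition together with its position. That difference is one simple way to picture the information loss the abstract refers to.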
