Leveraging Duration Pseudo-Embeddings in Multilevel LSTM and GCN Hypermodels for Outcome-Oriented PPM
Fang Wang, Paolo Ceravolo, Ernesto Damiani
TL;DR
Experiments on balanced and imbalanced outcome prediction tasks show that duration pseudo-embedding inputs consistently improve generalization, reduce model complexity, and enhance interpretability, and the benefits of explicit temporal encoding are demonstrated.
Abstract
Existing deep learning models for Predictive Process Monitoring (PPM) struggle with temporal irregularities, particularly stochastic event durations and overlapping timestamps, limiting their adaptability across heterogeneous datasets. We propose a dual input neural network strategy that separates event and sequence attributes, using a duration-aware pseudo-embedding matrix to transform temporal importance into compact, learnable representations. This design is implemented across two baseline families: B-LSTM and B-GCN, and their duration-aware variants D-LSTM and D-GCN. All models incorporate self-tuned hypermodels for adaptive architecture selection. Experiments on balanced and imbalanced outcome prediction tasks show that duration pseudo-embedding inputs consistently improve generalization, reduce model complexity, and enhance interpretability. Our results demonstrate the benefits of explicit temporal encoding and provide a flexible design for robust, real-world PPM applications.
