Auto-ML Graph Neural Network Hypermodels for Outcome Prediction in Event-Sequence Data

Fang Wang, Lance Kosca, Adrienne Kosca, Marko Gacesa, Ernesto Damiani

TL;DR

This work addresses outcome prediction from event-sequence data by introducing HGNN(O), an AutoML-GNN hypermodel that combines four architectures with six GNN operators and uses Bayesian optimization with pruning and early stopping to search configurations automatically. It encodes traces as temporally weighted graphs and is evaluated on balanced and imbalanced logs, achieving accuracy above 0.98 on Traffic Fines and weighted F1 up to 0.86 on Patients without explicit imbalance handling. The study shows that automated architecture and hyperparameter search, coupled with graph-based representations, yields robust, generalizable models for predictive business process monitoring. It also identifies GINConv as a consistently reliable operator, especially when paired with activity-embedding strategies in imbalanced scenarios, which offers practical guidance for deployment and points to future work on interpretability and imbalance-aware training.

Abstract

This paper introduces HGNN(O), an AutoML-GNN hypermodel framework for outcome prediction on event-sequence data. Building on our earlier work on graph convolutional network hypermodels, HGNN(O) extends four architectures (One Level, Two Level, Two Level Pseudo Embedding, and Two Level Embedding) across six canonical GNN operators. A self-tuning mechanism based on Bayesian optimization with pruning and early stopping enables efficient adaptation over architectures and hyperparameters without manual configuration. Empirical evaluation on both balanced and imbalanced event logs shows that HGNN(O) achieves accuracy exceeding 0.98 on the Traffic Fines dataset and weighted F1 scores up to 0.86 on the Patients dataset without explicit imbalance handling. These results demonstrate that the proposed AutoML-GNN approach provides a robust and generalizable benchmark for outcome prediction in complex event-sequence data.
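
The TL;DR states that traces are encoded as temporally weighted graphs, but this overview does not give the exact construction. The following is a minimal sketch of one plausible reading, not the authors' encoding: each event becomes a node, consecutive events are linked by a directed edge, and the edge weight is assumed to be the elapsed time in hours. The function name trace_to_graph, the (activity, timestamp) input format, and the example activity names are all hypothetical.

```python
from datetime import datetime


def trace_to_graph(trace):
    """Turn one trace, a list of (activity, ISO-8601 timestamp) pairs, into a graph.

    Returns (nodes, edges): nodes is the list of activity labels in event order,
    and edges is a list of (source_index, target_index, weight) triples.
    """
    # Sort by timestamp so edges follow the order in which events occurred.
    events = sorted(trace, key=lambda e: datetime.fromisoformat(e[1]))
    nodes = [activity for activity, _ in events]
    edges = []
    for i in range(len(events) - 1):
        t_src = datetime.fromisoformat(events[i][1])
        t_dst = datetime.fromisoformat(events[i + 1][1])
        # Assumption: the edge weight is the elapsed time in hours between
        # two consecutive events; the paper may define the weight differently.
        hours = (t_dst - t_src).total_seconds() / 3600.0
        edges.append((i, i + 1, hours))
    return nodes, edges


# Hypothetical trace; the activity names are made up for illustration.
example_trace = [
    ("register", "2024-01-01T09:00:00"),
    ("review", "2024-01-03T09:00:00"),
    ("close", "2024-01-03T15:00:00"),
]
nodes, edges = trace_to_graph(example_trace)
```

For the example trace, the two edges carry weights of 48.0 and 6.0 hours. A GNN library would typically expect these as an edge index plus an edge-weight vector, and the activity labels as integer-encoded or embedded node features.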

Paper Structure

This paper contains 17 sections, 6 equations, 3 figures, and 1 table.

Figures (3)

  • Figure 1: Accuracy Heatmap from Training on Traffic Fines Balanced Dataset.
  • Figure 2: F1 Score Heatmap from Training on Patients Imbalanced Dataset.
  • Figure 3: F1 Score Radar Charts showing predictive performance of each GNN.