Effect of a Process Mining based Pre-processing Step in Prediction of the Critical Health Outcomes

Negin Ashrafi, Armin Abdollahi, Greg Placencia, Maryam Pishgar

TL;DR

Predicting critical health outcomes such as mortality and hospital readmission is hampered by noisy, complex healthcare data. The authors apply an existing pre-processing algorithm, concatenation, to reduce dataset complexity before process mining and predictive modeling. They convert 16 healthcare datasets from MIMIC III and the University of Illinois Hospital into event logs, apply concatenation, build process models with Split Miner, and predict outcomes with the DREAM algorithm, comparing results before and after concatenation using AUC and confidence intervals. Results show that concatenation improves both process model quality and prediction quality, suggesting practical benefits for early risk stratification in clinical settings.

Abstract

Predicting critical health outcomes such as patient mortality and hospital readmission is essential for improving survivability. However, healthcare datasets contain many concurrences, which add complexity and lead to poor predictions. Consequently, pre-processing the data is crucial to improve its quality. In this study, we use an existing pre-processing algorithm, concatenation, to improve data quality by decreasing the complexity of datasets. Sixteen healthcare datasets were extracted from two databases, MIMIC III and University of Illinois Hospital, converted to event logs, and then fed into the concatenation algorithm. The pre-processed event logs were then fed to the Split Miner (SM) algorithm to produce a process model. Process model quality was evaluated before and after concatenation using the following metrics: fitness, precision, F-Measure, and complexity. The pre-processed event logs were also used as inputs to the Decay Replay Mining (DREAM) algorithm to predict critical outcomes. We compared the predictions before and after applying the concatenation algorithm using Area Under the Curve (AUC) and Confidence Intervals (CI). Results indicated that the concatenation algorithm improved both the quality of the process models and the predictions of critical health outcomes.
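
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the common process mining convention that F-Measure is the harmonic mean of fitness and precision, and it uses a percentile bootstrap for the AUC confidence interval; the abstract does not say how the CIs were computed, so the bootstrap is an assumption, and all function names and the toy data are hypothetical.

```python
import random


def f_measure(fitness: float, precision: float) -> float:
    """Harmonic mean of fitness and precision (assumed definition)."""
    if fitness + precision == 0:
        return 0.0
    return 2 * fitness * precision / (fitness + precision)


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC (an assumed method, not the paper's)."""
    auc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical toy data: 1 = critical outcome occurred, 0 = it did not.
labels = [0, 0, 1, 1, 0, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.5]

print("F-Measure:", f_measure(0.8, 0.6))
print("AUC:", auc(labels, scores))
print("95% CI:", bootstrap_auc_ci(labels, scores))
```

Running the same functions on predictions obtained before and after pre-processing would give a before/after comparison of the kind the abstract describes.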

Paper Structure (4 sections, 1 equation, 1 figure, 1 table)

Figures (1)

  • Figure 1: An example figure.