Explainable AI for Infection Prevention and Control: Modeling CPE Acquisition and Patient Outcomes in an Irish Hospital with Transformers

Minh-Khoi Pham, Tai Tan Mai, Martin Crane, Rob Brennan, Marie E. Ward, Una Geary, Declan Byrne, Brian O Connell, Colm Bergin, Donncha Creagh, Nick McDonald, Marija Bezbradica

TL;DR

This study demonstrates how Transformer-based models can be applied to real-world EMR data from an Irish hospital to predict CPE acquisition and related patient outcomes, all under an explainable AI framework. By integrating demographics, past medical history, ward trajectories, CPE screening results, and ward-level contact networks, the authors show that TabTransformer generally outperforms traditional baselines and provides interpretable attributions via Integrated Gradients. Global and local explanations reveal that while CPE codes contribute modestly to outcome predictions, factors such as admission ward, area of residence, historical admissions, and network centrality are influential for CPE risk. The work highlights practical IPC implications, offering a pathway for risk scoring and targeted screening, and underscores the need for richer, multi-institutional data to improve generalizability and precision in imbalanced clinical settings.

Abstract

Carbapenemase-Producing Enterobacteriaceae (CPE) pose a critical concern for infection prevention and control (IPC) in hospitals. However, predictive modeling of previously highlighted CPE-associated risks such as readmission, mortality, and extended length of stay (LOS) remains underexplored, particularly with modern deep learning approaches. This study introduces an eXplainable AI (XAI) modeling framework to investigate the impact of CPE on patient outcomes using Electronic Medical Records (EMR) data from an Irish hospital. We analyzed an inpatient dataset from an Irish acute hospital, incorporating diagnostic codes, ward transitions, patient demographics, infection-related variables and contact network features. Several Transformer-based architectures were benchmarked alongside traditional machine learning models. Clinical outcomes were predicted, and XAI techniques were applied to interpret model decisions. Our framework demonstrated the utility of Transformer-based models, with TabTransformer consistently outperforming baselines across multiple clinical prediction tasks, especially for CPE acquisition (AUROC and sensitivity). We found infection-related features, including historical hospital exposure, admission context, and network centrality measures, to be highly influential in predicting patient outcomes and CPE acquisition risk. Explainability analyses revealed that features like "Area of Residence", "Admission Ward" and prior admissions are key risk factors. Network variables like "Ward PageRank" also ranked highly, reflecting the potential value of structural exposure information. This study presents a robust and explainable AI framework for analyzing complex EMR data to identify key risk factors and predict CPE-related outcomes. Our findings underscore the superior performance of the Transformer models and highlight the importance of diverse clinical and network features.
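To make a network feature such as "Ward PageRank" concrete, the sketch below computes PageRank by power iteration over a directed graph of ward-to-ward patient transfers. It is only an illustration under stated assumptions: the abstract does not say how the authors built the contact network or computed centrality, and the function name `ward_pagerank`, the damping factor of 0.85, and the transfer log are all hypothetical.

```python
from collections import defaultdict


def ward_pagerank(transfers, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank over a directed ward-transfer graph.

    transfers: iterable of (from_ward, to_ward) pairs, one per patient move.
    Repeated pairs accumulate and act as edge weights.
    """
    weights = defaultdict(lambda: defaultdict(float))
    wards = set()
    for src, dst in transfers:
        weights[src][dst] += 1.0
        wards.update((src, dst))
    wards = sorted(wards)
    n = len(wards)
    rank = {w: 1.0 / n for w in wards}

    for _ in range(max_iter):
        # Wards with no outgoing transfers spread their rank uniformly.
        dangling = sum(rank[w] for w in wards if w not in weights)
        new_rank = {
            w: (1.0 - damping) / n + damping * dangling / n for w in wards
        }
        # Each ward passes its rank along outgoing edges, by edge weight.
        for src, targets in weights.items():
            out_total = sum(targets.values())
            for dst, wgt in targets.items():
                new_rank[dst] += damping * rank[src] * wgt / out_total
        delta = sum(abs(new_rank[w] - rank[w]) for w in wards)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical transfer log (not taken from the study's data).
transfers = [
    ("ED", "ICU"), ("ED", "Medical"), ("ED", "Surgical"),
    ("ICU", "Medical"), ("Surgical", "Medical"), ("Medical", "ED"),
]
scores = ward_pagerank(transfers)
for ward, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{ward}: {score:.3f}")
```

In this toy graph the ward that receives transfers from every other ward ends up with the highest score. That is the kind of structural exposure signal a per-ward centrality feature could pass to a downstream model.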

Paper Structure

This paper contains 37 sections, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Distribution of the two studied patient cohorts by CPE screening status: positive (CPE+) or negative (CPE-).
  • Figure 2: Overview of the proposed framework for processing and modeling structured EMR data. The pipeline integrates demographic data, ward history, diagnosis codes, and contact network features into Transformer-based predictive models.
  • Figure 3: Impact of CPE-related ICD-10CM diagnosis codes on patient outcome modeling using IG.
  • Figure 4: Feature importance rankings based on Integrated Gradients for each prediction task. Panels a), b) and c) show network-related and infection-related features (definitions can be found in the Appendix), whereas d) shows the top predictive features of CPE acquisition from our benchmarked models. Lower ranks indicate higher importance. Features are color-coded by group: Demographics, Network features, Past Episode Features, Current Episode Features.
  • Figure 5: t-SNE projection of learned patient episode embeddings from TabTransformer. Colors denote CPE status: CPE-, CPE+.
  • ...and 2 more figures
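
The Integrated Gradients (IG) attributions referenced in the figure captions above can be illustrated on a toy model. The sketch below applies IG to a small logistic model with an analytic gradient, approximating the path integral with a midpoint Riemann sum. It is a minimal illustration, not the authors' pipeline: the weights, bias, input, all-zeros baseline, and step count are hypothetical, and the paper applies IG to Transformer models, not to a logistic model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, weights, bias):
    """Toy logistic model standing in for a trained classifier."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def gradient(x, weights, bias):
    """Analytic gradient of the logistic output with respect to the input."""
    p = predict(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Approximate IG along the straight path from baseline to x."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = gradient(point, weights, bias)
        for j in range(n):
            avg_grad[j] += g[j] / steps
    # Scale the path-averaged gradient by the input's offset from baseline.
    return [(xi - b) * ag for xi, b, ag in zip(x, baseline, avg_grad)]


# Hypothetical model parameters and input (illustrative only).
weights = [0.8, -0.5, 1.2]
bias = -1.0
x = [1.0, 2.0, 0.5]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, weights, bias)
output_change = predict(x, weights, bias) - predict(baseline, weights, bias)
print("attributions:", [round(a, 4) for a in attributions])
print("sum of attributions:", round(sum(attributions), 4))
print("output change:", round(output_change, 4))
```

A useful sanity check is IG's completeness property: the attributions should sum, up to numerical error, to the difference between the model output at the input and at the baseline. Ranking features by attribution magnitude across many patients is one way global importance rankings like those in Figure 4 could be derived.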