Clustering of Disease Trajectories with Explainable Machine Learning: A Case Study on Postoperative Delirium Phenotypes
Xiaochen Zheng, Manuel Schürch, Xingyu Chen, Maria Angeliki Komninou, Reto Schüpbach, Ahmed Allam, Jan Bartussek, Michael Krauthammer
TL;DR
The paper addresses the heterogeneity of postoperative delirium (POD) by proposing a framework that combines supervised POD risk prediction with unsupervised clustering in the SHAP feature-importance space to uncover latent POD phenotypes. It first validates the approach on synthetic data, where clustering in SHAP space recovers the predefined phenotypes better than clustering in the raw feature space. A real-world case study in elderly surgical patients then shows that clinically relevant POD subtypes emerge from this SHAP-guided clustering, supporting precision prevention and targeted treatment. The work also emphasizes explainability and label reliability: SHAP-based feature importance helps identify meaningful phenotypes, and labeling noise in ICU and post-discharge delirium documentation is highlighted as a critical factor for future improvement.
Abstract
The identification of phenotypes within complex diseases or syndromes is a fundamental component of precision medicine, which aims to adapt healthcare to individual patient characteristics. Postoperative delirium (POD) is a complex neuropsychiatric condition with significant heterogeneity in its clinical manifestations and underlying pathophysiology. We hypothesize that POD comprises several distinct phenotypes, which cannot be directly observed in clinical practice. Identifying these phenotypes could enhance our understanding of POD pathogenesis and facilitate the development of targeted prevention and treatment strategies. In this paper, we propose an approach that combines supervised machine learning for personalized POD risk prediction with unsupervised clustering techniques to uncover potential POD phenotypes. We first demonstrate our approach using synthetic data, where we simulate patient cohorts with predefined phenotypes based on distinct sets of informative features. The synthetic data generation method is intended to be generic enough to mimic any clinical disease. By training a predictive model and applying SHAP (SHapley Additive exPlanations), we show that clustering patients in the SHAP feature importance space successfully recovers the true underlying phenotypes, outperforming clustering in the raw feature space. We then present a case study using real-world data from a cohort of elderly surgical patients. The results showcase the utility of our approach in uncovering clinically relevant subtypes of complex disorders like POD, paving the way for more precise and personalized treatment strategies.
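
The synthetic experiment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the cohort sizes, the two hypothetical phenotypes, the noise scale, the choice of logistic regression and k-means, and the decision to cluster only the simulated cases. To stay free of extra dependencies, it uses the closed-form SHAP values of a linear model under feature independence, phi_ij = w_j (x_ij - mean_j) on the log-odds scale, rather than whatever explainer and model the paper uses.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import adjusted_rand_score


def make_cohort(n_per_group=300, n_noise=20, shift=2.0, noise_scale=3.0, seed=0):
    """Simulate controls plus two hypothetical phenotypes of cases.

    Phenotype 0 cases are shifted on features 0-1, phenotype 1 cases on
    features 2-3. The remaining features are uninformative, high-variance noise.
    """
    rng = np.random.default_rng(seed)
    n = 3 * n_per_group
    X = rng.normal(size=(n, 4 + n_noise))
    X[:, 4:] *= noise_scale                      # noise features on a larger scale
    y = np.zeros(n, dtype=int)                   # 0 = control, 1 = case
    phenotype = np.full(n, -1)                   # -1 marks controls
    a = slice(n_per_group, 2 * n_per_group)
    b = slice(2 * n_per_group, n)
    X[a, 0:2] += shift                           # informative set of phenotype 0
    X[b, 2:4] += shift                           # informative set of phenotype 1
    y[a] = 1
    y[b] = 1
    phenotype[a] = 0
    phenotype[b] = 1
    return X, y, phenotype


def linear_shap(model, X):
    """Closed-form SHAP values (log-odds scale) of a linear model, assuming
    independent features and using X itself as the background data."""
    w = model.coef_.ravel()
    mu = X.mean(axis=0)
    phi = (X - mu) * w                           # one attribution per patient and feature
    base = float(mu @ w + model.intercept_[0])   # expected model output
    return phi, base


def compare_spaces(seed=0):
    """Cluster the simulated cases in raw space and in SHAP space and score
    both partitions against the known phenotypes (adjusted Rand index)."""
    X, y, phenotype = make_cohort(seed=seed)
    model = LogisticRegression(max_iter=1000).fit(X, y)
    phi, _ = linear_shap(model, X)
    cases = y == 1
    scores = {}
    for name, space in (("raw", X), ("shap", phi)):
        km = KMeans(n_clusters=2, n_init=10, random_state=seed)
        labels = km.fit_predict(space[cases])
        scores[name] = adjusted_rand_score(phenotype[cases], labels)
    return scores


print(compare_spaces())
```

In this construction the uninformative features have a larger variance than the phenotype signal, so k-means in the raw space tends to be distracted by them, while the SHAP values shrink those features toward zero because the classifier assigns them little weight. A tree-based model with a tree-specific explainer would be a natural substitute for the linear model when phenotypes depend on feature interactions.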
