Early Detection of Multidrug Resistance Using Multivariate Time Series Analysis and Interpretable Patient-Similarity Representations
Óscar Escudero-Arnanz, Antonio G. Marques, Inmaculada Mora-Jiménez, Joaquín Álvarez-Rodríguez, Cristina Soguero-Ruiz
TL;DR
This paper addresses early detection of multidrug resistance (MDR) in ICU patients by combining multivariate time series (MTS) representations with interpretable, graph-based analysis. The proposed framework computes patient-to-patient similarity with three methods: features based on descriptive statistics (FE), Dynamic Time Warping (DTW), and the Time Cluster Kernel (TCK). Dimensionality reduction and simple classifiers (Logistic Regression, Random Forest, and Support Vector Machines) are then applied to predict MDR, and similarity graphs and cluster structures are built from the same measures for interpretability. The key contributions are (i) an interpretable MTS-based pipeline that reaches a ROC-AUC of up to about 81% on ICU EHR data, (ii) a graph- and cluster-based knowledge extraction approach that reveals clinically meaningful MDR patterns and risk factors, and (iii) open-source code and a validation protocol that enable replication and extension. The framework supports early detection and risk factor identification, and it offers a foundation for applying explainable ML to similar clinical time-series problems in other institutions and conditions.
Abstract
Background and Objectives: Multidrug Resistance (MDR) is a critical global health issue, causing increased hospital stays, healthcare costs, and mortality. This study proposes an interpretable Machine Learning (ML) framework for MDR prediction, aiming for both accurate inference and enhanced explainability. Methods: Patients are modeled as Multivariate Time Series (MTS), capturing clinical progression and patient-to-patient interactions. Similarity among patients is quantified using MTS-based methods: descriptive statistics, Dynamic Time Warping, and Time Cluster Kernel. These similarity measures serve as inputs for MDR classification via Logistic Regression, Random Forest, and Support Vector Machines, with dimensionality reduction and kernel transformations improving model performance. For explainability, patient similarity networks are constructed from these metrics. Spectral clustering and t-SNE are applied to identify MDR-related subgroups and visualize high-risk clusters, enabling insight into clinically relevant patterns. Results: The framework was validated on ICU Electronic Health Records from the University Hospital of Fuenlabrada, achieving an AUC of 81%. It outperforms baseline ML and deep learning models by leveraging graph-based patient similarity. The approach identifies key risk factors (prolonged antibiotic use, invasive procedures, co-infections, and extended ICU stays) and reveals clinically meaningful clusters. Code and results are available at https://github.com/oscarescuderoarnanz/DM4MTS. Conclusions: Patient similarity representations combined with graph-based analysis provide accurate MDR prediction and interpretable insights. This method supports early detection, risk factor identification, and patient stratification, highlighting the potential of explainable ML in critical care.
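
To make the similarity step described in the Methods more concrete, the following is a minimal sketch of one of the three measures, Dynamic Time Warping, applied to multivariate series and then mapped to a similarity matrix. It is an illustration only and not the authors' implementation (which is available in the repository linked above). The function names `dtw_distance` and `similarity_matrix`, the toy patient data, the Euclidean local cost, and the exponential transform with parameter `gamma` are all assumptions introduced here.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two multivariate series.

    Each series is a list of time steps; each time step is a list of
    feature values of the same width. Series may differ in length.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance between two time steps (assumption)
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def similarity_matrix(patients, gamma=0.1):
    """Map pairwise DTW distances to similarities in (0, 1].

    The transform exp(-gamma * distance) is an illustrative choice;
    gamma is a hypothetical bandwidth parameter.
    """
    n = len(patients)
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = math.exp(-gamma * dtw_distance(patients[i], patients[j]))
            sim[i][j] = sim[j][i] = s
    return sim


# Toy data (fabricated for illustration): 3 patients, 2 variables per time step
patients = [
    [[0, 0], [1, 0], [2, 0]],          # patient 0
    [[0, 0], [0, 0], [1, 0], [2, 0]],  # patient 1: same trajectory, delayed one step
    [[5, 5], [6, 5], [7, 5]],          # patient 2: different value range
]
sim = similarity_matrix(patients)
```

In this toy example, patients 0 and 1 follow the same trajectory with a one-step delay, so DTW aligns them at zero cost, while patient 2 is less similar to both. A matrix of this kind could serve as the input to a kernel-based classifier or as the weighted adjacency matrix of a patient similarity graph. Note that a kernel derived from DTW in this way is not guaranteed to be positive semi-definite, so additional care may be needed before using it with methods that require a valid kernel.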
