XTSFormer: Cross-Temporal-Scale Transformer for Irregular-Time Event Prediction in Clinical Applications
Tingsong Xiao, Zelin Xu, Wenchong He, Zhengkun Xiao, Yupu Zhang, Zibo Liu, Shigang Chen, My T. Thai, Jiang Bian, Parisa Rashidi, Zhe Jiang
TL;DR
XTSFormer tackles irregular-time event prediction in clinical EHR data by introducing a feature-based cycle-aware time positional encoding (FCPE) and a cross-temporal-scale transformer built on a time hierarchy. FCPE encodes time via learnable frequencies and feature-dependent intensities; by Bochner's theorem, this yields a translation-invariant kernel. The cross-scale attention confines keys to the same scale, which keeps multi-scale interactions scalable. A bottom-up clustering-based time hierarchy defines the temporal scales, enabling efficient cross-scale attention and reduced computation without sacrificing granularity. Empirical results on the Medications, Providers, and MIMIC-IV datasets show superior predictive performance and favorable time costs, supported by ablations, sensitivity analyses, and interpretable case studies. This approach advances accurate, scalable, and interpretable modeling of de facto clinical care pathways, with direct implications for patient safety and decision support, and it can be extended to consecutive-event prediction in future work. The unified loss combines the time and type prediction terms: $\mathcal{L} = (1-\alpha)\mathcal{L}_t + \alpha\mathcal{L}_p$.
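As a small illustration of how the weighting in the unified loss behaves, the sketch below combines a time term and a type term with a weight $\alpha$. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes $\mathcal{L}_t$ is the time-prediction loss and $\mathcal{L}_p$ the event-type loss, and it uses squared error and cross-entropy as stand-ins, since this summary does not state the exact form of either term. The function names are hypothetical.

```python
import math


def squared_error(pred_time: float, true_time: float) -> float:
    """Illustrative stand-in for the time loss L_t (an assumption)."""
    return (pred_time - true_time) ** 2


def cross_entropy(type_probs: list[float], true_type: int) -> float:
    """Illustrative stand-in for the type loss L_p (an assumption)."""
    return -math.log(type_probs[true_type])


def combined_loss(time_loss: float, type_loss: float, alpha: float) -> float:
    """Compute L = (1 - alpha) * L_t + alpha * L_p for alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * time_loss + alpha * type_loss


if __name__ == "__main__":
    # Hypothetical prediction: next event 2.0 hours away (true gap 3.0 hours),
    # with predicted probabilities over three event types (true type index 1).
    l_t = squared_error(2.0, 3.0)
    l_p = cross_entropy([0.2, 0.7, 0.1], 1)
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, combined_loss(l_t, l_p, alpha))
```

Setting $\alpha = 0$ leaves only the time term and $\alpha = 1$ only the type term, so $\alpha$ acts as a single knob for trading off the two objectives.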
Abstract
Adverse clinical events related to unsafe care are among the top ten causes of death in the U.S. Accurate modeling and prediction of clinical events from electronic health records (EHRs) play a crucial role in enhancing patient safety. An example is modeling de facto care pathways that characterize common step-by-step plans for treatment or care. However, clinical event data pose several unique challenges, including the irregularity of time intervals between consecutive events, the existence of cycles and periodicity, multi-scale event interactions, and the high computational costs associated with long event sequences. Existing neural temporal point process (TPP) methods do not effectively capture the multi-scale nature of event interactions, which is common in many real-world clinical applications. To address these issues, we propose the cross-temporal-scale transformer (XTSFormer), specifically designed for irregularly timed event data. Our model consists of two vital components: a novel Feature-based Cycle-aware Time Positional Encoding (FCPE) that captures the cyclical nature of time, and a hierarchical multi-scale temporal attention mechanism, where different temporal scales are determined by a bottom-up clustering approach. Extensive experiments on several real-world EHR datasets show that XTSFormer outperforms multiple baseline methods. The code is available at https://github.com/spatialdatasciencegroup/XTSFormer.
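To make the idea of a cycle-aware, translation-invariant time encoding concrete, the sketch below uses a plain sinusoidal feature map. It is only an illustration under stated assumptions, not FCPE itself: the frequencies are fixed to a daily and a weekly cycle (measured in hours) rather than learned, and the feature-dependent intensities are omitted. The function names are hypothetical. The inner product of two such encodings equals the average of $\cos(\omega_k (t_1 - t_2))$ over the frequencies, so it depends only on the time difference.

```python
import math


def time_encoding(t: float, freqs: list[float]) -> list[float]:
    """Map a timestamp to [cos(w t), sin(w t)] pairs, one pair per frequency."""
    scale = 1.0 / math.sqrt(len(freqs))
    encoding = []
    for w in freqs:
        encoding.append(scale * math.cos(w * t))
        encoding.append(scale * math.sin(w * t))
    return encoding


def kernel(t1: float, t2: float, freqs: list[float]) -> float:
    """Inner product of two encodings; equals mean_k cos(w_k * (t1 - t2))."""
    e1 = time_encoding(t1, freqs)
    e2 = time_encoding(t2, freqs)
    return sum(a * b for a, b in zip(e1, e2))


if __name__ == "__main__":
    # Assumed cycles: 24 hours (daily) and 168 hours (weekly).
    freqs = [2 * math.pi / 24, 2 * math.pi / 168]

    # Translation invariance: shifting both timestamps by the same amount
    # leaves the kernel value unchanged.
    print(kernel(3.0, 10.0, freqs), kernel(103.0, 110.0, freqs))

    # Cycle awareness: timestamps exactly one week apart get the same encoding.
    same = all(
        math.isclose(a, b, abs_tol=1e-9)
        for a, b in zip(time_encoding(5.0, freqs), time_encoding(173.0, freqs))
    )
    print(same)
```

With irregular timestamps this is the useful property: two events are compared by where they fall within the cycles and by how far apart they are, not by their absolute position in the sequence.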
