TraCeR: Transformer-Based Competing Risk Analysis with Longitudinal Covariates

Maxmillan Ries, Sohan Seth

TL;DR

TraCeR incorporates longitudinal covariates into competing-risk survival analysis through a transformer-based architecture with time-aware covariate embeddings and a factorized attention encoder. The model jointly learns time-varying covariate interactions and cause-specific hazards in discrete time, and is validated on four real-world datasets with emphasis on calibration as well as discrimination. It optimizes a local, event-specific likelihood and produces well-calibrated risk estimates, addressing a key gap in prior survival modeling. The work targets clinical risk estimation, delivering robust, interpretable hazard predictions under censoring and competing risks, and its ablations highlight the benefit of factorized temporal-covariate attention.

Abstract

Survival analysis is a critical tool for modeling time-to-event data. Recent deep learning-based models have relaxed various modeling assumptions, including proportional hazards and linearity. However, two persistent challenges remain: incorporating longitudinal covariates, as prior work has largely focused on cross-sectional features, and assessing the calibration of these models, as evaluation has primarily focused on discrimination. We introduce TraCeR, a transformer-based survival analysis framework for incorporating longitudinal covariates. Based on a factorized self-attention architecture, TraCeR estimates the hazard function from a sequence of measurements, naturally capturing temporal covariate interactions without assumptions about the underlying data-generating process. The framework is inherently designed to handle censored data and competing events. Experiments on multiple real-world datasets demonstrate that TraCeR achieves substantial and statistically significant performance improvements over state-of-the-art methods. Furthermore, our evaluation extends beyond discrimination metrics and assesses model calibration, addressing a key oversight in the literature.

Paper Structure

This paper contains 21 sections, 9 equations, 2 figures, 4 tables.

Figures (2)

  • Figure 1: Architecture. The raw sequence of numerical and discrete covariates is encoded per time-step through two embedding paths, before missingness and time-decay are applied. The resulting embeddings $(\tau_i, D, d_{\text{emb}})$ are fed into factorized attention layers. The resulting combinatorial embedding is summarized through a learnable query and passed on to $K$ cause-specific subnetworks. Each subnetwork produces a sequence of hazards $\lambda_j^k(\mathbf{X}_i)$.
  • Figure 2: Cause-specific calibration curves for all methods on the same test set, for the MIMIC-IV and PBC2 datasets. The $y$-axis is the IPCW-corrected true event rate; the $x$-axis is the mean predicted CIF $\hat{F}_{t_i}^k(\mathbf{X}_i)$.
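
The captions above refer to per-interval cause-specific hazards $\lambda_j^k(\mathbf{X}_i)$ and to a predicted cumulative incidence function (CIF). The following is a minimal sketch of how the two are commonly related in discrete-time competing-risk models, assuming the standard identity in which the CIF for cause $k$ accumulates the hazard of cause $k$ weighted by the probability of being event-free before each interval. The function name `cif_from_hazards` and the input layout are hypothetical, and the paper's exact formulation is not shown in this section.

```python
from typing import List


def cif_from_hazards(hazards: List[List[float]]) -> List[List[float]]:
    """Convert discrete-time cause-specific hazards into CIFs.

    hazards[j][k] is assumed to be the probability of an event of cause k
    in interval j, given that no event occurred before interval j.
    Returns cif[j][k], the probability of an event of cause k by the end
    of interval j.
    """
    n_causes = len(hazards[0])
    event_free = 1.0              # probability of no event before interval j
    cumulative = [0.0] * n_causes
    cif = []
    for row in hazards:
        total = sum(row)
        if any(h < 0.0 for h in row) or total > 1.0 + 1e-12:
            raise ValueError("hazards in one interval must be >= 0 and sum to <= 1")
        # Add the mass of events of each cause that fall in this interval.
        cumulative = [c + event_free * h for c, h in zip(cumulative, row)]
        cif.append(cumulative)
        # Survive this interval only if no cause fires.
        event_free *= 1.0 - total
    return cif


# Two intervals, two competing causes (illustrative numbers only).
example = cif_from_hazards([[0.1, 0.2], [0.3, 0.1]])
```

With these illustrative inputs, the CIFs after the second interval and the remaining event-free probability sum to one, which is a quick sanity check on any hazard-to-CIF conversion.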