TransformerLSR: Attentive Joint Model of Longitudinal Data, Survival, and Recurrent Events with Concurrent Latent Structure

Zhiyue Zhang; Yao Zhao; Yanxun Xu

TL;DR

TransformerLSR introduces a continuous-time Transformer framework to jointly model longitudinal measurements, recurrent events, and survival. It uses a novel trajectory representation and a causal mask to encode known clinical structure, while modeling recurrent and survival events with deep temporal point processes in an encoder–decoder architecture. The method is trained with a composite loss combining longitudinal prediction, event likelihood, and survival likelihood, and uses Monte Carlo estimation and thinning for efficient inference. Across simulations and a DIVAT kidney transplantation dataset, TransformerLSR achieves superior or competitive performance in predicting longitudinal outcomes, estimating event intensities, and projecting survival, with scalable handling of asynchronous data and missingness. These capabilities support personalized, time-aware decision-making in complex biomedical settings where multiple data streams interact over continuous time.
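The thinning step mentioned above can be illustrated with a minimal sketch of the standard thinning (rejection) sampler for a point process with a bounded intensity. This is not the authors' implementation: the function name `sample_next_event`, its parameters, and the toy sinusoidal intensity are hypothetical, and the sketch assumes a known upper bound `lam_max` on the intensity over the sampling window.

```python
import math
import random


def sample_next_event(intensity, t0, horizon, lam_max, rng):
    """Draw the next event time after t0 by thinning.

    Returns None if no event is accepted before `horizon`.
    Assumes 0 <= intensity(t) <= lam_max for all t in (t0, horizon].
    All names here are illustrative, not taken from the paper.
    """
    t = t0
    while True:
        # Propose a candidate from a homogeneous Poisson process with rate lam_max.
        t += rng.expovariate(lam_max)
        if t > horizon:
            return None
        # Accept the candidate with probability intensity(t) / lam_max.
        if rng.random() * lam_max < intensity(t):
            return t


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy intensity bounded above by 1.0 (an assumption for this example).
    toy_intensity = lambda s: 0.5 + 0.4 * math.sin(s)
    print(sample_next_event(toy_intensity, 0.0, 50.0, 1.0, rng))
```

In a learned model the callable `intensity` would be replaced by the network's predicted intensity conditioned on the patient history, and `lam_max` would have to be chosen or estimated so that it dominates that intensity.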

Abstract

In applications such as biomedical studies, epidemiology, and social sciences, recurrent events often co-occur with longitudinal measurements and a terminal event, such as death. Therefore, jointly modeling longitudinal measurements, recurrent events, and survival data while accounting for their dependencies is critical. While joint models for the three components exist in the statistical literature, many of these approaches are limited by heavy parametric assumptions and scalability issues. Recently, incorporating deep learning techniques into joint modeling has shown promising results. However, current methods only address joint modeling of survival events and longitudinal measurements at regularly spaced observation times, neglecting recurrent events. In this paper, we develop TransformerLSR, a flexible transformer-based deep modeling and inference framework to jointly model all three components simultaneously. TransformerLSR integrates deep temporal point processes into the joint modeling framework, treating recurrent and terminal events as two competing processes dependent on past longitudinal measurements and recurrent event times. Additionally, TransformerLSR introduces a novel trajectory representation and model architecture to potentially incorporate a priori knowledge of known latent structures among concurrent longitudinal variables. We demonstrate the effectiveness and necessity of TransformerLSR through simulation studies and an analysis of a real-world medical dataset on patients after kidney transplantation.
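To make the idea of recurrent and terminal events as two competing processes concrete, the sketch below evaluates the textbook log-likelihood of one subject: the sum of log intensities at the observed recurrent event times, plus the log hazard at the terminal time if death was observed rather than censored, minus the integral of intensity plus hazard over follow-up. The integral is approximated by simple Monte Carlo. The paper's exact likelihood is not reproduced in this section, so this is an illustration under those assumptions; `joint_event_log_likelihood` and its arguments are hypothetical names.

```python
import math
import random


def joint_event_log_likelihood(intensity, hazard, event_times, end_time,
                               died, n_samples=2000, rng=None):
    """Log-likelihood of recurrent events and a terminal event for one subject.

    intensity, hazard: callables of time (assumed positive where evaluated).
    event_times: observed recurrent event times in (0, end_time].
    died: True if the terminal event occurred at end_time, False if censored.
    Illustrative sketch only; names and form are assumptions.
    """
    rng = rng or random.Random(0)
    # Log-intensity at each observed recurrent event.
    ll = sum(math.log(intensity(t)) for t in event_times)
    # Log-hazard at the terminal event, only when it was actually observed.
    if died:
        ll += math.log(hazard(end_time))
    # Monte Carlo estimate of the integral of (intensity + hazard) over [0, end_time].
    total = 0.0
    for _ in range(n_samples):
        s = rng.random() * end_time
        total += intensity(s) + hazard(s)
    ll -= end_time * total / n_samples
    return ll


if __name__ == "__main__":
    # Constant rates make the result easy to check by hand.
    print(joint_event_log_likelihood(lambda s: 2.0, lambda s: 0.5,
                                     [1.0, 2.0, 3.0], 4.0, died=True))
```

With constant rates the Monte Carlo integral is exact, so the value can be verified by hand: three events at intensity 2.0, an observed death at hazard 0.5, and follow-up of 4.0 give 3 log 2 + log 0.5 - 10.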

Paper Structure (17 sections, 11 equations, 7 figures, 2 tables, 1 algorithm)

Introduction
Methods
Model Architecture
Trajectory Representation and Causal Mask Incorporating Known Clinical Knowledge
Encoder
Decoder
Training
Inference
Simulation study
Simulation setup
Simulation results
Application to DIVAT
Conclusion
Asynchronous missing data
Likelihood functions
...and 2 more sections

Figures (7)

Figure 1: Architecture of TransformerLSR, with layer normalization and residual connections omitted for clarity. The encoder processes the input patient history $\mathcal{H}_j$ and feeds its output to the decoder, which outputs the event intensity $\lambda(t_j+\tau)$ and hazard $h(t_j+\tau)$ after lag time $\tau$. The longitudinal variables $Y_{1:m}(t_j+\tau)$ are predicted autoregressively, where each output $\widehat{Y}_u(t_j+\tau)$ is fed back to the decoder input for the prediction of subsequent variables.
Figure 2: Mean survival curves under three different censoring distributions.
Figure 3: Sample log recurrent event intensity function compared with the ground truth under the three settings.
Figure 4: Dynamic prediction case studies for two patients from the DIVAT kidney transplantation dataset. Top panel: left, prediction of log creatinine level at the next visit vs. actual observation; middle, prediction of log tacrolimus dosage at the next visit vs. actual assignment; right, predicted log recurrent event intensity at each clinic visit. Bottom panel: left, conditional survival function given the history of the first five clinic visits; right, conditional survival function given the history up to the sixth last visit.
Figure 5: Rollout case studies for two patients from the DIVAT kidney transplantation dataset. Left panel: predicted creatinine level and tacrolimus dosage based on their previous observed values. Right panel: fitted hazard function and recurrent event intensity values at the observed visit times and predicted next visit time.
...and 2 more figures
