Towards modeling evolving longitudinal health trajectories with a transformer-based deep learning model

Hans Moen, Vishnu Raj, Andrius Vabalas, Markus Perola, Samuel Kaski, Andrea Ganna, Pekka Marttinen

TL;DR

This work proposes Evolve, a unidirectional Transformer with a causal attention mask to model evolving health trajectories in nationwide electronic health records. By training to predict the same forecast-interval labels at every time step, conditioned on the history up to that step, the model produces a trajectory of predictions rather than a single outcome. It also enables trajectory analysis through age embeddings and neighborhood dynamics, and detects trajectory-changing events via sigmoid shifts and embedding changes. Empirically, Evolve achieves performance on par with a bidirectional CLS model and competitive with XGBoost, while offering interpretable trajectory signals that could support continuous monitoring and early intervention in clinical practice.

Abstract

Health registers contain rich information about individuals' health histories. Here our interest lies in understanding how individuals' health trajectories evolve in a nationwide longitudinal dataset with coded features, such as clinical codes, procedures, and drug purchases. We introduce a straightforward approach for training a Transformer-based deep learning model in a way that lets us analyze how individuals' trajectories change over time. This is achieved by modifying the training objective and by applying a causal attention mask. We focus here on a general task of predicting the onset of a range of common diseases in a given future forecast interval. However, instead of providing a single prediction about diagnoses that could occur in this forecast interval, our approach enables the model to provide continuous predictions at every time point up until, and conditioned on, the time of the forecast period. We find that this model performs comparably to other models, including a bidirectional Transformer model, in terms of basic prediction performance while at the same time offering promising trajectory modeling properties. We explore a couple of ways to use this model for analyzing health trajectories and aiding in early detection of events that forecast possible later disease onsets. We hypothesize that this method may be helpful in continuous monitoring of people's health trajectories and enabling interventions in ongoing health trajectories, as well as being useful in retrospective analyses.
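To make the two ingredients named in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function names `causal_mask` and `per_step_bce` are hypothetical, and averaging the loss uniformly over all time steps and classes (with no handling of padding) is an assumption made for illustration only.

```python
import math


def causal_mask(n):
    """Lower-triangular mask: position i may attend only to positions 0..i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def per_step_bce(logits, labels):
    """Mean binary cross-entropy with the same targets at every time step.

    logits: list over time steps, each a list of per-class logits.
    labels: a single list of 0/1 forecast-interval labels, shared by all steps.
    Assumption: uniform averaging over steps and classes.
    """
    total, count = 0.0, 0
    for step_logits in logits:
        for z, y in zip(step_logits, labels):
            # Numerically stable form of -[y*log(s(z)) + (1-y)*log(1-s(z))]
            total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
            count += 1
    return total / count


# Three time steps, two disease classes; the label vector is reused per step.
mask = causal_mask(3)
loss = per_step_bce([[0.2, -1.0], [0.8, -1.5], [1.5, -2.0]], [1, 0])
```

Because every position is scored against the same forecast-interval labels while seeing only its own past, reading the per-step sigmoid outputs in order gives a trajectory of predictions rather than a single value.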

Paper Structure

This paper contains 20 sections, 5 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: An example of two artificial persons' health progressions. The plots show how the model's prediction probabilities -- sigmoid values -- for the various class labels (diseases to occur in the forecast interval) change over time as the persons age. The sigmoids for the none class are also included. The orange dots indicate the fraction of changes in the nearest neighborhood from one age to the next, calculated from age-wise embedding similarities to the $k$-nearest individuals.
  • Figure 2: The figure illustrates the model and its inputs: codes, ages, positional information (pos), and years to forecast (t2f). A decision layer with sigmoid activations enables the model to predict the targeted labels -- reflecting what will happen in the forecast interval -- at each input position.
  • Figure 3: The dataset was split into historical and forecast intervals and further split into train, validation, and test sets.
  • Figure 4: Average neighborhood change from one year to the next. The included individuals are mothers who did (red, target group TG) and did not (green, control group CG) lose their child at a specific age. Death at age 1, $n_{\mathrm{TG}}=27$ and $n_{\mathrm{CG}}=27$ (upper left); death at age 2, $n_{\mathrm{TG}}=18$ and $n_{\mathrm{CG}}=18$ (upper right); death at age 3, $n_{\mathrm{TG}}=23$ and $n_{\mathrm{CG}}=23$ (lower left); death at age 4, $n_{\mathrm{TG}}=19$ and $n_{\mathrm{CG}}=19$ (lower right). Neighborhood size $k$ is set to 1000.
  • Figure 5: The figure shows the evolving age-wise similarities between the target individuals and the top $k=100$ most similar representatives from each diagnosis (class), calculated using the associated embedding representations from the Evolve model. This shows an alternative way of visualizing the health trajectories of the same artificial persons as in Figure 1 (Person A and Person B).
  • ...and 3 more figures
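
The neighborhood-change measure mentioned in the captions of Figures 1 and 4 can be illustrated with a small sketch. It is an interpretation under stated assumptions, not the paper's exact procedure: cosine similarity as the similarity measure, "change" defined as the fraction of the previous $k$ nearest neighbors that are no longer among the current ones, and non-zero embedding vectors. The helper names `cosine`, `k_nearest`, and `neighborhood_change` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def k_nearest(target, others, k):
    """Ids of the k embeddings most similar to target.

    others: dict mapping person id -> embedding at the same age as target.
    """
    ranked = sorted(others, key=lambda pid: cosine(target, others[pid]),
                    reverse=True)
    return set(ranked[:k])


def neighborhood_change(prev_neighbors, curr_neighbors):
    """Fraction of the previous neighbors missing from the current set."""
    return len(prev_neighbors - curr_neighbors) / len(prev_neighbors)


# Toy data: the target's embedding rotates between two consecutive ages.
others = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
before = k_nearest([1.0, 0.0], others, k=2)  # {"a", "b"}
after = k_nearest([0.0, 1.0], others, k=2)   # {"b", "c"}
print(neighborhood_change(before, after))    # 0.5
```

In this toy case, half of the neighborhood is replaced from one age to the next, which is the kind of jump the orange dots in Figure 1 are described as capturing.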