Trajectory-informed graph-based clustering for longitudinal cancer subtyping

Lara Cavinato, Marco Rocchi, Luca Viganò, Francesca Ieva

TL;DR

We propose a trajectory-informed clustering method for cancer subtyping that integrates multi-modal clinical data with longitudinal patient trajectories. The method constructs a patient similarity graph, which enables the identification of patient subgroups that are phenotypically and genotypically distinct and also aligned with patterns of disease progression.

Abstract

Cancer subtyping plays a crucial role in informing prognosis and guiding personalized treatment strategies. However, conventional subtyping approaches often rely on static, biopsy-derived scores that poorly capture the biological heterogeneity and temporal evolution of the disease. In this study, we propose a novel trajectory-informed clustering method for cancer subtyping that integrates multi-modal clinical data and longitudinal patient trajectories. Our method constructs a patient similarity graph using time-varying imaging-derived features, clinical covariates, and transitions among key clinical states such as therapy, surveillance, relapse, and death. This graph structure enables the identification of patient subgroups that are not only phenotypically and genotypically distinct but also aligned with patterns of disease progression. We position our approach within the landscape of existing subtyping methods and highlight its advantages in terms of temporal modeling and graph-based interpretability. Through simulation studies and an application to a real-world dataset of liver metastases, we demonstrate the ability of our framework to uncover clinically relevant subtypes with distinct prognostic trajectories. Our results underscore the potential of trajectory-informed clustering to enhance personalized oncology by bridging cross-sectional biomarkers with dynamic disease evolution.
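To make the idea of a patient similarity graph concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian kernel on feature vectors, a Jaccard similarity on the sets of observed state transitions, an equal-weight blend (`alpha = 0.5`), and a two-way spectral split; all of these choices, the function names, and the toy data are hypothetical illustrations.

```python
import numpy as np

def gaussian_similarity(X, sigma=1.0):
    """Pairwise Gaussian-kernel similarity between rows of X (assumed kernel)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def transition_similarity(paths):
    """Jaccard similarity between the sets of state transitions of each patient."""
    sets = [set(zip(p[:-1], p[1:])) for p in paths]
    n = len(sets)
    S = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            union = sets[i] | sets[j]
            S[i, j] = S[j, i] = len(sets[i] & sets[j]) / len(union) if union else 1.0
    return S

def two_way_spectral_split(W):
    """Split the graph nodes in two using the sign of the Fiedler vector."""
    W = W.copy()
    np.fill_diagonal(W, 0.0)               # drop self-loops
    L = np.diag(W.sum(axis=1)) - W         # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)            # eigenvalues in ascending order
    return (vecs[:, 1] > 0).astype(int)

# Toy cohort (hypothetical): six patients, two imaging-derived features each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]])
paths = [["therapy", "surveillance"]] * 3 + [["therapy", "relapse", "death"]] * 3

alpha = 0.5  # assumed weight between feature and trajectory similarity
W = alpha * gaussian_similarity(X) + (1 - alpha) * transition_similarity(paths)
labels = two_way_spectral_split(W)
print(labels)
```

In this toy cohort the first three patients share both features and trajectories, so they fall in one group and the last three in the other. Which group receives label 0 or 1 depends on the arbitrary sign of the eigenvector.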

Paper Structure

This paper contains 40 sections, 62 equations, 15 figures, 10 tables, 1 algorithm.

Figures (15)

  • Figure 1: Example of a clock-reset multi-state model (MSM) representing the natural history and clinical management of cancer. The states reflect key phases of disease progression: Healthy represents the absence of malignancy; Cancer precursor indicates a subclinical or pre-malignant condition; Diagnosis marks the point of clinical detection; Therapy includes the treatment phase such as surgery, chemotherapy, or radiotherapy; Recurrence denotes relapse after treatment; and Death is the absorbing terminal state. Arrows represent possible transitions between states, and time is reset at each state transition, consistent with a clock-reset MSM framework. This structure allows for modelling heterogeneous disease trajectories and estimating transition hazards between clinically relevant states.
  • Figure 2: Log-log plot of the execution time of the proposed algorithm as a function of the number of individuals in the study, ranging from 100 to 800. Data points represent average execution times measured over 30 runs. The linear trend on the log-log scale suggests that execution time grows as a power law in the number of individuals over this range.
  • Figure 3: Schematic illustration of the clinical timeline for a representative patient in the cohort. (1) Baseline CT scan. (2) Chemotherapy. (3) Post-treatment CT scan. (4) Surgical resection. (5) Follow-up for relapse or death.
  • Figure 4: Schematic illustration of the embedding process for categorical variables using a Word2Vec-inspired Continuous Bag-of-Words model.
  • Figure 5: Schematic representation of the three clock-reset multi-state models (MSMs) used in this study. Each model encodes different assumptions about the timing and sequence of clinical transitions: Model 1 captures the path from baseline to therapy to death; Model 2 includes relapse as an intermediate state before death; Model 3 allows for competing relapse and death trajectories.
  • ...and 10 more figures
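
The clock-reset convention described in the caption of Figure 1, where time restarts at each state transition, can be illustrated with a short sketch. The function name `clock_reset_intervals`, the example history, and the time unit (months) are hypothetical and are not taken from the paper; the sketch only converts calendar entry times into per-transition sojourn times.

```python
def clock_reset_intervals(history):
    """Convert a patient's history into clock-reset transition records.

    history: list of (state, calendar_entry_time) pairs, ordered by time.
    Returns a list of (from_state, to_state, sojourn_time), where the
    sojourn time is measured from entry into from_state (clock reset).
    """
    intervals = []
    for (state, t_in), (next_state, t_next) in zip(history, history[1:]):
        if t_next < t_in:
            raise ValueError("history must be ordered by entry time")
        intervals.append((state, next_state, t_next - t_in))
    return intervals

# Hypothetical patient: calendar times in months since diagnosis.
history = [("Diagnosis", 0.0), ("Therapy", 1.5),
           ("Recurrence", 14.0), ("Death", 20.5)]
intervals = clock_reset_intervals(history)
for from_state, to_state, sojourn in intervals:
    print(f"{from_state} -> {to_state}: {sojourn} months in {from_state}")
```

Each record carries the time spent in the originating state rather than the time since study entry, which is the quantity a clock-reset model uses when estimating transition hazards.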