Identifying treatment response subgroups in observational time-to-event data

Vincent Jeanselme; Chang Ho Yoon; Fabian Falck; Brian Tom; Jessica Barrett

TL;DR

This work tackles the challenge of identifying patient subgroups with distinct treatment responses in observational time-to-event data. It introduces Causal Survival Clustering (CSC), a neural network–based framework that jointly learns latent subgroups and subgroup-specific survival distributions under treated and untreated conditions, using inverse propensity weighting to adjust for non-random treatment assignment. The approach formalizes Subgroup Average Treatment Effects (SATE) and leverages a monotonic survival network to flexibly model time-to-event outcomes without strong parametric assumptions. Through synthetic experiments and a SEER case study, CSC outperforms state-of-the-art baselines, particularly in observational settings, and provides data-driven guidance for selecting the number of subgroups via an elbow heuristic. The results offer a practical, hypothesis-generating tool to inform clinical trial design and treatment guidelines by revealing heterogeneous treatment responses across real-world patient populations.
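The inverse propensity weighting step mentioned above can be illustrated with a minimal sketch. It assumes each patient already has an estimated propensity P(A = 1 | x) and a per-patient log-likelihood from some survival model. The function names, the clipping threshold, and the normalisation by the sum of weights are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def ipw_weights(propensity, treated, clip=1e-3):
    """Inverse propensity weights 1 / P(A = a_i | x_i).

    `propensity` holds estimated P(A = 1 | x_i); `treated` holds the observed
    treatment a_i. Clipping (an assumption of this sketch) avoids extreme
    weights when the estimated propensity is close to 0 or 1.
    """
    p = np.clip(np.asarray(propensity, dtype=float), clip, 1.0 - clip)
    a = np.asarray(treated, dtype=bool)
    return np.where(a, 1.0 / p, 1.0 / (1.0 - p))


def weighted_nll(log_lik, weights):
    """Weighted average negative log-likelihood over patients."""
    log_lik = np.asarray(log_lik, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return -np.sum(weights * log_lik) / np.sum(weights)


# Hypothetical values: one treated patient with propensity 0.25,
# one untreated patient with propensity 0.5.
weights = ipw_weights([0.25, 0.5], [1, 0])
loss = weighted_nll([-1.0, -2.0], weights)
```

Patients who received a treatment they were unlikely to receive get larger weights, so the weighted sample resembles one in which treatment was assigned independently of the covariates.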

Abstract

Identifying patient subgroups with different treatment responses is an important task to inform medical recommendations, guidelines, and the design of future clinical trials. Existing approaches for treatment effect estimation primarily rely on Randomised Controlled Trials (RCTs), which tend to feature more homogeneous patient groups, making them less relevant for uncovering subgroups in the population encountered in real-world clinical practice. Observational studies benefit from larger and more representative populations, but subgroup analyses established for RCTs suffer from significant statistical biases when applied to them. Our work introduces a novel, outcome-guided subgroup analysis strategy for identifying subgroups of treatment response in both RCTs and observational studies. It thus sits between individualised and average treatment effect estimation, uncovering patient subgroups with distinct treatment responses, which is critical for actionable insights that may influence treatment guidelines. In experiments, our approach significantly outperforms the current state-of-the-art method for subgroup analysis in both randomised and observational treatment regimes.

Paper Structure (43 sections, 15 equations, 9 figures, 7 tables)


Introduction
Method
Problem setup
Estimating the quantities of interest
Subgroup assignment.
Survival distributions.
Inverse propensity weighting.
Training procedure
Experimental analysis
Data generation
Empirical settings
Benchmark methods.
Evaluation.
Treatment effect recovery
Recovering the underlying number of subgroups.
...and 28 more sections

Figures (9)

Figure 1: Subgroup treatment effect discovery in time-to-event observational data. Our work aims to identify subgroups of patients with similar treatment responses to guide clinical practice and design clinical trials. Our method simultaneously models the treatment effect and identifies subgroups while addressing treatment non-randomisation and censoring.
Figure 2: Graphical representation of the dependencies between covariates ($X$), treatment ($A$) and outcomes ($T, D$). Realisations of dashed variables are unobserved, while $X$, $A$, $T$ and $D$ are observed.
Figure 3: Causal Survival Clustering (CSC) architecture. The latent parameter $u_k$ characterising subgroup $k$ is fed into the monotonic network $M$ to estimate survival under both treatment regimes. $G$ assigns the probability of belonging to each subgroup given the patient's covariates $x$. The network $W$ estimates the treatment propensity used to account for treatment assignment bias.
Figure 4: Average negative log-likelihood across 5-fold cross-validation as a function of the number of subgroups $K$ under the "Observational" treatment assignment, with the shaded area representing the 95% CI. The negative log-likelihood shows an elbow around the underlying number of subgroups.
Figure 5: Treatment effects of the subgroups identified in the SEER dataset, averaged across 5-fold cross-validation, with the shaded areas representing 95% CIs.
...and 4 more figures
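
The roles of $G$ and $M$ described in the Figure 3 caption can be sketched as a mixture: each patient's survival curve is a weighted combination of subgroup-level curves, with weights given by the assignment probabilities. The sketch below rests on several assumptions. Exponential curves with hypothetical rates stand in for the monotonic network $M$, a softmax over hypothetical logits stands in for $G$, and the treated-minus-untreated survival difference is used as one possible subgroup effect measure. None of these specifics are confirmed by the captions.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax giving subgroup membership probabilities."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def mixture_survival(assign_prob, subgroup_survival):
    """Combine subgroup curves into patient-level curves.

    assign_prob: (n_patients, K) membership probabilities.
    subgroup_survival: (K, 2, n_times) curves, axis 1 = (untreated, treated).
    Returns an array of shape (n_patients, 2, n_times).
    """
    return np.einsum("nk,kat->nat", assign_prob, subgroup_survival)


def subgroup_effect(subgroup_survival):
    """Treated minus untreated survival for each subgroup and time point."""
    return subgroup_survival[:, 1, :] - subgroup_survival[:, 0, :]


times = np.linspace(0.0, 5.0, 6)
# Hypothetical hazard rates: rows are subgroups, columns are (untreated, treated).
rates = np.array([[0.5, 0.2], [0.3, 0.3]])
subgroup_survival = np.exp(-rates[:, :, None] * times[None, None, :])
# Hypothetical assignment logits for two patients.
assign_prob = softmax(np.array([[2.0, 0.0], [0.0, 2.0]]))

surv = mixture_survival(assign_prob, subgroup_survival)
effect = subgroup_effect(subgroup_survival)
```

In this toy setup the first subgroup benefits from treatment (positive effect at every time after zero) while the second does not respond, which is the kind of contrast the subgroup analysis aims to surface.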
