A Bayesian likely responder approach for the analysis of randomized controlled trials

Annan Deng, Carole Siegel, Hyung G. Park

Abstract

An important goal of precision medicine is to personalize medical treatment by identifying individuals who are most likely to benefit from a specific treatment. The Likely Responder (LR) framework supports this goal by identifying a subpopulation in which treatment response is expected to exceed a certain clinical threshold. However, the LR framework, and data-driven subgroup analyses more generally, often fail to account for the uncertainty in estimating the model that defines the subgroups. We propose a simple two-stage approach that integrates subgroup identification with subsequent subgroup-specific inference on treatment effects. We incorporate model estimation uncertainty from the first stage into subgroup-specific treatment effect estimation in the second stage by using the Bayesian posterior distributions from the first stage. We evaluate our method through simulations, demonstrating that the proposed Bayesian two-stage model produces better-calibrated confidence intervals than naïve approaches. We apply our method to an international COVID-19 treatment trial, which shows substantial variation in treatment effects across data-driven subgroups.
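The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation: the linear outcome model, the normal approximation to the posterior, the continuous outcome for which larger values are better, and all function and variable names (`stage1_effect_draws`, `stage2_ate_draws`, `min_cond`) are assumptions made for illustration only. Each posterior draw from Stage 1 defines its own LR subgroup, and the subgroup-specific treatment effect is re-estimated within it, so that the pooled draws reflect uncertainty in subgroup membership as well as sampling variability.

```python
import numpy as np


def stage1_effect_draws(x, a, y, n_draws, rng):
    """Stage 1 (assumed model): y = b0 + b1*x + a*(g0 + g1*x) + noise.

    Returns an (n_draws, n) array of draws of each patient's treatment
    effect g0 + g1*x, using a normal approximation to the posterior
    under a flat prior (an illustrative simplification).
    """
    X = np.column_stack([np.ones_like(x), x, a, a * x])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    beta = rng.multivariate_normal(beta_hat, sigma2 * xtx_inv, size=n_draws)
    # Columns 2 and 3 hold g0 and g1; broadcast over patients.
    return beta[:, [2]] + beta[:, [3]] * x[None, :]


def stage2_ate_draws(y, a, tau_draws, min_cond, rng):
    """Stage 2: one ATE draw for the LR subgroup per Stage 1 draw.

    For each draw, patients whose predicted effect exceeds min_cond form
    the LR subgroup. The ATE is the difference in arm means within that
    subgroup, perturbed by its normal-approximation sampling error.
    """
    draws = []
    for tau in tau_draws:
        lr = tau > min_cond
        y1, y0 = y[lr & (a == 1)], y[lr & (a == 0)]
        if len(y1) < 2 or len(y0) < 2:
            continue  # subgroup too small to estimate a variance
        diff = y1.mean() - y0.mean()
        se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
        draws.append(diff + se * rng.standard_normal())
    return np.asarray(draws)


# Simulated two-arm trial (hypothetical data, true effect = 2 * x).
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(-1, 1, n)
a = rng.integers(0, 2, n)
y = 0.5 * x + a * (2.0 * x) + rng.normal(0, 1, n)

tau_draws = stage1_effect_draws(x, a, y, n_draws=500, rng=rng)
ate_lr = stage2_ate_draws(y, a, tau_draws, min_cond=0.5, rng=rng)
ate_mean = ate_lr.mean()
ci = np.percentile(ate_lr, [2.5, 97.5])
print(f"ATE_LR: {ate_mean:.2f}, 95% interval: ({ci[0]:.2f}, {ci[1]:.2f})")
```

A naïve analysis would instead fix the LR subgroup once, using the point estimate of the treatment effect, and report the within-subgroup interval as if subgroup membership were known. The sketch differs only in letting the subgroup vary across posterior draws.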

Paper Structure

This paper contains 21 sections, 27 equations, 7 figures, 13 tables.

Figures (7)

  • Figure 1: Diagram of a two-stage likely responder framework design. In Stage 1, LR and UR represent likely responder and unlikely responder subgroups, respectively. In Stage 2, $\text{ATE}_{LR}$ and $\text{ATE}_{UR}$ represent the Average Treatment Effect in the LR and UR subgroups. minCond is set pre-trial, corresponding to a clinically meaningful threshold to define LR vs. UR.
  • Figure 2: Scatterplots of the feature values on the x-axis versus the corresponding SHAP values on the y-axis. The dashed horizontal line represents a SHAP value of zero, where points above the line indicate a positive contribution to the binary outcome of ventilation or death at day 14, while points below indicate a negative contribution. The range of SHAP values varies across the plots. Features in the first row, including baseline WHO score and age, exhibit the largest range (-0.2, 0.3). Features in the second row, including two clinical features (blood type, days since symptom onset) and the two confounders (RCT ID and enrollment quarter), have a smaller range (-0.15, 0.15). Features in the third row, including sex and preexisting conditions (cardiovascular disease, pulmonary disease, and diabetes), have the smallest range (-0.02, 0.02). More important predictors show greater variability along the y-axis.
  • Figure S1: Variable inclusion proportions. The dots mark the posterior means; the vertical error bars mark the 95% credible intervals.
  • Figure S2: Diagnostic comparison of the estimated prognostic score $\hat{s}(x)$ between treatment arms. (a) Kernel density estimates show substantial overlap of $\hat{s}(x)$ across treated and control groups. (b) Empirical cumulative distribution functions (ECDFs) likewise indicate good overlap. These diagnostics support the covariate-overlap assumption underlying the PBS calculation.
  • Figure S3: Posterior predictive check for treated patients: comparison of observed event rate (vertical line) with posterior predictive distribution of replicated event rates.
  • ...and 2 more figures