A Bayesian Approach for Selecting Relevant External Data (BASE): Application to a study of Long-Term Outcomes in a Hemophilia Gene Therapy Trial

Tianyu Pan, Yiyao Shi, Xiang Zhang, Weining Shen, Ting Ye

TL;DR

The paper introduces BASE, a Bayesian data-selection framework that borrows external longitudinal data selectively: it evaluates candidate external subsets via their marginal likelihood in order to improve long-term inference of gene-therapy outcomes under censoring. It models endogenous factor IX trajectories using Concatenated Cubic Hermite splines (CCHs) to capture post-treatment dynamics and plateaus, integrating internal trial data with a carefully chosen external subset. The authors establish theoretical consistency results, demonstrate performance through extensive simulations, and apply BASE to HOPE-B data, where the results indicate that Etranacogene dezaparvovec can sustain FIX production and the predictions carry reduced uncertainty. The approach is generalizable to other longitudinal trials with censoring and offers interpretable borrowing decisions by identifying external data-generating mechanisms similar to that of the internal data.
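As a rough illustration of the spline idea mentioned above, the following Python sketch evaluates a piecewise cubic Hermite curve from knot values and knot derivatives, and holds the curve flat after the last knot to mimic a plateau. The function names `hermite_segment` and `mean_trajectory`, the knot locations, and all numbers are illustrative assumptions; this is not the paper's CCH parameterization and does not use trial data.

```python
from bisect import bisect_right


def hermite_segment(t, t0, t1, y0, y1, m0, m1):
    """Cubic Hermite interpolant on [t0, t1] with end values y0, y1 and end slopes m0, m1."""
    h = t1 - t0
    s = (t - t0) / h  # rescale t to [0, 1]
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return h00 * y0 + h10 * h * m0 + h01 * y1 + h11 * h * m1


def mean_trajectory(t, knots, values, slopes):
    """Concatenate Hermite segments over sorted knots; constant outside the knot range."""
    if t <= knots[0]:
        return values[0]
    if t >= knots[-1]:
        return values[-1]  # assumed plateau after the last knot
    i = bisect_right(knots, t) - 1  # index of the segment containing t
    return hermite_segment(
        t, knots[i], knots[i + 1], values[i], values[i + 1], slopes[i], slopes[i + 1]
    )


# Hypothetical knots (years after index date), mean levels, and derivatives.
knots = [0.0, 0.5, 2.0]
values = [0.0, 40.0, 35.0]
slopes = [100.0, 0.0, 0.0]

curve = [mean_trajectory(t, knots, values, slopes) for t in (0.25, 0.5, 1.25, 5.0)]
```

Specifying both a value and a derivative at each knot is what lets the segments join smoothly, and a zero derivative at the last knot makes the transition into the flat region smooth as well.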

Abstract

Gene therapies aim to address the root causes of diseases, particularly those stemming from rare genetic defects that can be life-threatening or severely debilitating. Although an increasing number of gene therapies have received regulatory approvals in recent years, understanding their long-term efficacy in trials with limited follow-up time remains challenging. To address this critical question, we propose a novel Bayesian framework designed to selectively integrate relevant external data with internal trial data to improve inference on the durability of long-term efficacy. We prove that the proposed method has desirable theoretical properties, such as identifying and favoring external subsets deemed relevant, where relevance is defined as the similarity, induced by the marginal likelihood, between the generating mechanisms of the internal data and the selected external data. We also conduct comprehensive simulations to evaluate its performance under various scenarios. Furthermore, we apply this method to predict and infer the long-term endogenous factor IX (FIX) levels of patients who receive Etranacogene dezaparvovec. Our estimated long-term FIX levels, validated by recent trial data, indicate that Etranacogene dezaparvovec induces sustained FIX production. Together, the theoretical findings, simulation results, and successful application of this framework underscore its potential to address similar long-term effectiveness estimation and inference questions in real-world applications.
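To make the notion of marginal-likelihood-induced relevance more concrete, here is a minimal Python sketch under a deliberately simple assumption: observations are normal with known standard deviation `sigma`, and the mean has a normal prior with standard deviation `tau`. It compares the marginal likelihood of pooling an external subset with the internal data against modeling the two separately, and keeps the subsets whose log Bayes factor favors pooling. The toy model, the function names, the default `sigma` and `tau`, and the data are all assumptions for illustration; the paper's actual model for longitudinal trajectories with censoring is considerably richer.

```python
import math


def log_marginal(y, sigma, tau):
    """Log marginal likelihood of y under y_i ~ N(mu, sigma^2) with prior mu ~ N(0, tau^2)."""
    n = len(y)
    s2, t2 = sigma**2, tau**2
    total = sum(y)
    sum_sq = sum(v * v for v in y)
    # Covariance is s2 * I + t2 * 11^T; its determinant and inverse have closed forms.
    log_det = (n - 1) * math.log(s2) + math.log(s2 + n * t2)
    quad = (sum_sq - t2 * total**2 / (s2 + n * t2)) / s2
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)


def log_bayes_factor(internal, external, sigma=0.5, tau=5.0):
    """Log Bayes factor of 'shared mean' (pooled) versus 'separate means'."""
    pooled = log_marginal(internal + external, sigma, tau)
    separate = log_marginal(internal, sigma, tau) + log_marginal(external, sigma, tau)
    return pooled - separate


# Hypothetical data: one external subset resembles the internal data, one does not.
internal = [1.0, 1.2, 0.9, 1.1]
candidates = {
    "similar": [1.05, 0.95, 1.15],
    "dissimilar": [4.0, 4.2, 3.9],
}

# Keep only the subsets for which pooling is favored.
selected = [
    name for name, ext in candidates.items() if log_bayes_factor(internal, ext) > 0
]
```

In this toy setting the subset with a similar mean is retained and the subset with a very different mean is rejected, which mirrors, in a much simpler form, the idea of favoring external data whose generating mechanism resembles that of the internal data.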

Paper Structure

This paper contains 33 sections, 6 theorems, 59 equations, 10 figures, and 18 tables.

Key Result

Theorem 2.1

Suppose that the external and internal trajectories are generated following iidDist and that Assumptions (A1)-(A4), detailed in the Supplementary Materials, hold. If $\|p_{\theta_0}-p_{\theta_1}\|_1\lesssim \epsilon_{N_0^*}$, the expected Bayes factor can be controlled as follows. On the other hand, if $\|p_{\theta_0}-p_{\theta_1}\|_1\gtrsim \epsilon_{N_0^*}\times \psi_{N_0^*}$ instead, for $\psi_{N_0^*}$ diverg

Figures (10)

  • Figure 1: FIX expression over time in clinical trials. The index date refers to the date when patients received Etranacogene dezaparvovec.
  • Figure 2: The blue dashed lines represent the derivatives at specific time points, while the black solid line illustrates the mean value. The turning point is indicated by the red dashed line.
  • Figure 3: Simulated sample data under DGP 1. The trajectories are generated with the real-data censoring patterns.
  • Figure 4: The trajectories of z-transformed factor level up to the 2nd year after index date.
  • Figure 5: The spaghetti plot of outcomes from internal study and the estimated trend of outcome using BASE. The green dashed curves represent the $95\%$ credible interval.
  • ...and 5 more figures

Theorems & Definitions (10)

  • Theorem 2.1
  • Theorem 2.2
  • Theorem 7.1
  • proof
  • Lemma 7.2
  • proof
  • Lemma 7.3
  • proof
  • Theorem 7.4
  • proof