A Bayesian Approach for Selecting Relevant External Data (BASE): Application to a study of Long-Term Outcomes in a Hemophilia Gene Therapy Trial
Tianyu Pan, Yiyao Shi, Xiang Zhang, Weining Shen, Ting Ye
TL;DR
The paper introduces BASE, a Bayesian data-selection framework that selectively borrows external longitudinal data: it scores candidate external subsets by marginal likelihood and uses the selected subset to improve long-term inference on gene-therapy outcomes subject to censoring. Endogenous factor IX (FIX) trajectories are modeled with Concatenated Cubic Hermite splines (CCHs), which capture post-treatment dynamics and plateaus, and the internal trial data are combined with the selected external subset. The authors establish theoretical consistency results, evaluate performance in extensive simulations, and apply BASE to HOPE-B data, where the estimates indicate that Etranacogene dezaparvovec can sustain FIX production and the predictions carry reduced uncertainty. The approach generalizes to other longitudinal trials with censoring, and its borrowing decisions are interpretable because it identifies external data whose generating mechanisms resemble that of the internal data.
Abstract
Gene therapies aim to address the root causes of diseases, particularly those stemming from rare genetic defects that can be life-threatening or severely debilitating. Although an increasing number of gene therapies have received regulatory approval in recent years, understanding their long-term efficacy in trials with limited follow-up time remains challenging. To address this question, we propose a novel Bayesian framework that selectively integrates relevant external data with internal trial data to improve inference on the durability of long-term efficacy. We prove that the proposed method has desirable theoretical properties, such as identifying and favoring relevant external subsets, where relevance is defined as the similarity, induced by the marginal likelihood, between the generating mechanisms of the internal data and the selected external data. We also conduct comprehensive simulations to evaluate its performance under various scenarios. Furthermore, we apply the method to predict and infer the long-term endogenous factor IX (FIX) levels of patients who receive Etranacogene dezaparvovec. Our estimated long-term FIX levels, validated by recent trial data, indicate that Etranacogene dezaparvovec induces sustained FIX production. Together, the theoretical findings, simulation results, and successful application of this framework underscore its potential to address similar long-term effectiveness estimation and inference questions in real-world applications.
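To make the idea of marginal-likelihood-based selection concrete, the following is a minimal toy sketch and not the BASE procedure itself. It assumes a conjugate normal model with known noise variance (an assumption made purely for illustration, with no splines and no censoring) and scores each candidate external subset by the log predictive density of the internal data given that subset. All names (`update`, `log_predictive`, `select_subset`), the prior settings, and the data values are hypothetical.

```python
import math
from itertools import combinations


def update(mean, var, y, noise_var):
    """Conjugate update of a N(mean, var) belief about mu after y ~ N(mu, noise_var)."""
    post_var = 1.0 / (1.0 / var + 1.0 / noise_var)
    post_mean = post_var * (mean / var + y / noise_var)
    return post_mean, post_var


def log_predictive(internal, external, prior_mean=0.0, prior_var=100.0, noise_var=1.0):
    """Log marginal density of the internal data after conditioning on the external data.

    Toy model (assumed): every observation is N(mu, noise_var) with a normal prior on mu.
    """
    mean, var = prior_mean, prior_var
    # Absorb the external observations into the belief about mu.
    for y in external:
        mean, var = update(mean, var, y, noise_var)
    # Chain rule: sum the one-step-ahead predictive log densities of the internal data.
    total = 0.0
    for y in internal:
        pred_var = var + noise_var
        total += -0.5 * (math.log(2.0 * math.pi * pred_var) + (y - mean) ** 2 / pred_var)
        mean, var = update(mean, var, y, noise_var)
    return total


def select_subset(internal, sources):
    """Score every subset of external sources (including the empty one); return the best."""
    names = sorted(sources)
    best = None
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            pooled = [y for name in subset for y in sources[name]]
            score = log_predictive(internal, pooled)
            if best is None or score > best[1]:
                best = (subset, score)
    return best


# Hypothetical data: source "A" resembles the internal data, source "B" does not.
internal = [4.8, 5.1, 5.3, 4.9]
sources = {
    "A": [5.0, 5.2, 4.7, 5.1, 4.9],
    "B": [9.8, 10.1, 10.3, 9.9, 10.0],
}

best_subset, best_score = select_subset(internal, sources)
print(best_subset, round(best_score, 3))
```

In this toy setting the score favors borrowing from source A alone: pooling A sharpens the prediction of the internal data, while adding B pulls the predicted mean away from it. Enumerating all subsets is only feasible for a handful of sources.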
