Proximal Learning for Trials With External Controls: A Case Study in HIV Prevention

Yilin Song; Yinxiang Wu; Raphael J. Landovitz; Susan Buchbinder; Srilatha Edupuganti; Lydia Soto-Torres; Kendrick Li; Xu Shi; Fei Gao; Deborah Donnell; Holly Janes; Ting Ye

TL;DR

A novel application of proximal causal inference methods to estimate the counterfactual cumulative HIV incidence under placebo for participants in an active-controlled trial of cabotegravir, using external control data from a placebo-controlled trial with similar eligibility criteria.

Abstract

With the advent of effective pre-exposure prophylaxis agents, active-controlled HIV prevention trials have become a common study design. Nevertheless, estimating absolute efficacy relative to a placebo remains important. In this paper, we introduce a novel application of proximal causal inference methods to estimate the counterfactual cumulative HIV incidence under placebo for participants in an active-controlled trial of cabotegravir, using external control data from a placebo-controlled trial with similar eligibility criteria. We leverage baseline sexually transmitted infection status and geographic region as negative control outcome and exposure variables, respectively. We address two key challenges: unmeasured differences in HIV risk between trials and statistical difficulties arising from low HIV incidence rates in both studies. To overcome these challenges, we develop two proximal inference approaches: (1) a semiparametric inverse probability of censoring weighting estimator, and (2) a two-stage regression-based strategy tailored to low-event-rate settings. Our theoretical and numerical investigations demonstrate these methods yield reliable estimates of the counterfactual one-year cumulative HIV incidence under placebo, and provide robust evidence of the superior efficacy of cabotegravir compared with placebo. These findings highlight the potential of proximal inference methods to estimate placebo-controlled effects in both single-arm and active-controlled trials by leveraging external controls.
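The negative-control idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' estimator: it assumes a continuous outcome, linear models, no censoring, and no measured covariates, and it uses a generic linear proximal two-stage least squares fit. All names (`proximal_two_stage`, `simulate`, `u_ext`, `w_tgt`, and so on) and all simulation parameters are hypothetical. In the simulated data, an unmeasured risk factor `u` is shifted between the external control study and the target trial, so the naive external-control mean is biased; the negative control exposure `z` and negative control outcome `w` are used to correct for that shift.

```python
import numpy as np


def proximal_two_stage(y_ext, w_ext, z_ext, w_target):
    """Linear two-stage sketch (illustrative assumption, not the paper's method).

    Stage 1: regress the negative control outcome W on the negative control
             exposure Z among external controls.
    Stage 2: regress the outcome Y on the stage-1 fitted values of W, giving
             a linear outcome bridge h(W) = b0 + b1 * W.
    The counterfactual mean in the target trial is the average of h(W) there.
    """
    n = len(y_ext)
    design1 = np.column_stack([np.ones(n), z_ext])
    gamma, *_ = np.linalg.lstsq(design1, w_ext, rcond=None)
    w_hat = design1 @ gamma

    design2 = np.column_stack([np.ones(n), w_hat])
    beta, *_ = np.linalg.lstsq(design2, y_ext, rcond=None)
    return float(beta[0] + beta[1] * np.mean(w_target))


def simulate(n=20000, seed=0):
    """Toy data: the unmeasured factor U has mean 0 in the external study
    and mean 1 in the target trial (hypothetical values)."""
    rng = np.random.default_rng(seed)
    u_ext = rng.normal(0.0, 1.0, n)   # external controls
    u_tgt = rng.normal(1.0, 1.0, n)   # target trial participants
    # Z is a binary proxy of U; W is a noisy linear proxy of U.
    z_ext = rng.binomial(1, 1.0 / (1.0 + np.exp(-1.5 * u_ext)))
    w_ext = 0.5 + u_ext + rng.normal(size=n)
    w_tgt = 0.5 + u_tgt + rng.normal(size=n)
    # Outcome under control is observed only in the external study.
    y_ext = 1.0 + 2.0 * u_ext + rng.normal(size=n)
    return y_ext, w_ext, z_ext, w_tgt


y_ext, w_ext, z_ext, w_tgt = simulate()
naive = float(np.mean(y_ext))  # ignores the shift in U between studies
proximal = proximal_two_stage(y_ext, w_ext, z_ext, w_tgt)
print(f"naive: {naive:.2f}, proximal: {proximal:.2f}, truth: 3.00")
```

In this toy setup the true counterfactual mean in the target trial is 1 + 2 × 1 = 3 by construction. The naive mean stays near 1, while the two-stage estimate should land close to 3. A real analysis with rare, censored time-to-event outcomes would need the methods developed in the paper rather than this linear shortcut.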

Paper Structure (28 sections, 2 theorems, 85 equations, 7 figures, 12 tables)

HIV Prevention Trials
HPTN 083 as the primary trial of interest
Estimating counterfactual HIV incidence under placebo
HVTN704/HPTN 085 as an external control dataset
Gaps in existing statistical methods for external controls
Review of proximal inference methods
Method development and HPTN 083 application
Methods
Notation and setup
A proximal causal inference approach
Method 1: Semiparametric IPCW estimator
Method 2: Regression-based two-stage estimator
Case Study: Application to HIV Prevention Trials
Estimating counterfactual HIV incidence under placebo
Statistical inference on absolute efficacy
...and 13 more sections

Key Result

Theorem 1

Suppose Assumption 1$^*$ and the positivity through censoring assumptions hold, and that there exist square-integrable functions $h$ and $q$ that satisfy the following integral equations almost surely: Then, $P(T^*(0)\leq t\mid R=0)$ is nonparametrically identified and can be expressed in either of the following two forms: Furthermore, $P(T^*(0)\leq t\mid R=0)$ can be identified through the following augmented

Figures (7)

Figure 1: Data processing workflow. We first excluded observations with missing key covariates, including gender, race, and age. We then constructed three separate analytic datasets by excluding individuals with missing baseline gonorrhea, missing baseline chlamydia, or missing values for either gonorrhea or chlamydia, respectively. Baseline STI measures were used as negative control outcomes.
Figure 2: Causal DAG depicting the assumed causal relationships underpinning the negative control assumption. The unmeasured confounder $U$, such as local HIV transmission environment risk, and observed covariates $\bm{X}$ jointly affect the potential HIV outcome under placebo $T^*(0)$, the negative control outcome $W$ (e.g., STI status), and the study indicator $R$. The coarse regional variable $Z$ (e.g., Latin America vs. non-Latin America) serves as a proxy for $U$ through its dependence on local region. Under this structure, $Z$ and $W$ satisfy the conditions for valid negative control exposure and outcome, respectively.
Figure 3: Estimated counterfactual one-year HIV cumulative incidence under placebo (in %) for HPTN 083 using different methods and either baseline gonorrhea or chlamydia infection as the NCO. The red dashed line indicates the observed one-year cumulative incidence in the cabotegravir arm (0.41%), and the blue dashed line indicates the observed one-year cumulative incidence in the TDF/FTC arm (1.22%).
Figure S1: Estimated counterfactual HIV cumulative incidence through 1 year (%) for HPTN 083 using different methods and NCOs. All arms in the AMP data were combined as the external control dataset. The red dashed line indicates the observed incidence rate in the cabotegravir arm (0.41 per 100 person-years), and the blue dashed line indicates the observed incidence rate in the TDF/FTC arm (1.22 per 100 person-years). Age, race, and gender were adjusted for in all models. The x-axis is plotted on a log scale for better visualization.
Figure S2: Estimated counterfactual HIV cumulative incidence through 1 year (%) for HPTN 083 using different methods and rectal gonorrhea and chlamydia as the NCO. The red dashed line indicates the observed incidence rate in the cabotegravir arm (0.41 per 100 person-years), and the blue dashed line indicates the observed incidence rate in the TDF/FTC arm (1.22 per 100 person-years). Age, race, and gender were adjusted for in all models. Only the placebo arm in the AMP data was used as the external control dataset. The x-axis is plotted on a log scale for better visualization.
...and 2 more figures

Theorems & Definitions (4)

Theorem 1: Semiparametric IPCW identification
Theorem 2
proof
proof
