On the Role of Surrogates in Conformal Inference of Individual Causal Effects
Chenyin Gao, Peter B. Gilbert, Larry Han
TL;DR
This work tackles uncertainty quantification for individual treatment effects (ITEs) using conformal inference, which often yields prediction intervals too wide to be useful. It introduces SCIENCE, a surrogate-assisted conformal inference framework that leverages surrogate outcomes, under covariate shift and in semi-supervised settings, to produce more efficient yet valid prediction intervals for ITEs. By deriving semi-parametric efficiency bounds via efficient influence functions and establishing PAC-type coverage, SCIENCE achieves narrower intervals in simulations and in real data (the Moderna COVE trial) when the surrogates are predictive. The approach supports reliable individualized uncertainty quantification in precision medicine, with practical gains from surrogate markers and flexible nuisance-function estimation under mild regularity conditions.
Abstract
Learning the Individual Treatment Effect (ITE) is essential for personalized decision-making, yet causal inference has traditionally focused on aggregated treatment effects. While integrating conformal prediction with causal inference can provide valid uncertainty quantification for ITEs, the resulting prediction intervals are often excessively wide, limiting their practical utility. To address this limitation, we introduce Surrogate-assisted Conformal Inference for Efficient Individual Causal Effects (SCIENCE), a framework designed to construct more efficient prediction intervals for ITEs. SCIENCE accommodates covariate shift between source and target data and applies to various data configurations, including semi-supervised and surrogate-assisted semi-supervised learning. Leveraging semi-parametric efficiency theory, SCIENCE produces rate doubly robust prediction intervals under mild convergence-rate conditions, permitting the use of flexible non-parametric models to estimate nuisance functions. We quantify efficiency gains by comparing semi-parametric efficiency bounds with and without the surrogates. Simulation studies demonstrate that our surrogate-assisted intervals offer substantial efficiency improvements over existing methods while maintaining valid group-conditional coverage. Applied to the phase 3 Moderna COVE COVID-19 vaccine trial, SCIENCE illustrates how multiple surrogate markers can be leveraged to generate more efficient prediction intervals.
