Continuous Treatment Effects with Surrogate Outcomes

Zhenghao Zeng; David Arbour; Avi Feller; Raghavendra Addanki; Ryan Rossi; Ritwik Sinha; Edward H. Kennedy

TL;DR

This paper addresses the estimation of continuous treatment effects when the primary outcome is partially missing, using surrogate outcomes and unlabeled data in a semi-supervised, doubly robust framework. It derives an identification result and constructs a pseudo-outcome-based estimator that remains consistent if either the outcome model or the treatment-density models are correctly specified, and that is asymptotically normal under nonparametric smoothing. The authors prove oracle efficiency under mild bias conditions and quantify the variance reduction gained by incorporating surrogates and unlabeled data. Simulations and a real-data application to Job Corps support these results, and the application reveals nonlinear dose-response behavior. Because the nuisance functions can be estimated flexibly (including with machine learning), the approach applies broadly to dose-response estimation with missing primary outcomes. Its practical value is a more efficient and principled use of surrogate information to recover causal dose-response relationships when outcomes are costly or incomplete.
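The double robustness property mentioned above can be illustrated on a much simpler problem than the paper's: estimating the mean of an outcome that is missing at random given a covariate. The sketch below is not the authors' estimator. It is a generic augmented inverse-probability-weighted (AIPW) mean, and every name in it (`true_label_prob`, `aipw_mean`, `dr_results`), along with the data-generating process, is a hypothetical choice made for illustration. It shows that the estimate stays close to the truth when either the outcome model or the labeling-probability model is correct, and drifts only when both are wrong.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def true_label_prob(x):
    # Hypothetical labeling mechanism: probability of observing y depends on x.
    return 0.2 + 0.6 * sigmoid(1.5 * x)


def true_outcome_mean(x):
    # Hypothetical outcome regression; the true mean of y is 1.0.
    return 1.0 + 2.0 * x


def make_data(n=20000, seed=1):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = true_outcome_mean(x) + rng.gauss(0.0, 1.0)
        r = 1 if rng.random() < true_label_prob(x) else 0  # 1 = labeled
        data.append((x, y, r))
    return data


def aipw_mean(data, outcome_model, label_model):
    """Generic AIPW estimate of E[y]; y contributes only when r == 1."""
    total = 0.0
    for x, y, r in data:
        m = outcome_model(x)
        total += m + r * (y - m) / label_model(x)
    return total / len(data)


def wrong_outcome(x):
    return 0.0  # deliberately misspecified outcome model


def wrong_label(x):
    return 0.5  # deliberately misspecified labeling model


data = make_data()
dr_results = {
    "both_correct": aipw_mean(data, true_outcome_mean, true_label_prob),
    "outcome_wrong": aipw_mean(data, wrong_outcome, true_label_prob),
    "label_model_wrong": aipw_mean(data, true_outcome_mean, wrong_label),
    "both_wrong": aipw_mean(data, wrong_outcome, wrong_label),
}
for name, value in dr_results.items():
    print(f"{name}: {value:.3f}")
```

In this toy setting the nuisance models are plugged in as known functions rather than estimated. The paper's setting instead involves a continuous treatment, surrogates, and estimated nuisance functions, so the sketch only conveys the intuition behind "consistent if either model is correct".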

Abstract

In many real-world causal inference applications, the primary outcomes (labels) are often partially missing, especially if they are expensive or difficult to collect. If the missingness depends on covariates (i.e., missingness is not completely at random), analyses based on fully observed samples alone may be biased. Incorporating surrogates, which are fully observed post-treatment variables related to the primary outcome, can improve estimation in this case. In this paper, we study the role of surrogates in estimating continuous treatment effects and propose a doubly robust method to efficiently incorporate surrogates in the analysis, which uses both labeled and unlabeled data and does not suffer from the above selection bias problem. Importantly, we establish the asymptotic normality of the proposed estimator and show possible improvements on the variance compared with methods that solely use labeled data. Extensive simulations show our methods enjoy appealing empirical performance.
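The selection bias described in the abstract can be seen in a small simulation. The following is a minimal sketch under assumed conditions, not an experiment from the paper: the covariate, outcome model, and labeling probability are all hypothetical, and the labeling probability is treated as known. It compares the mean over all units (unavailable in practice), the mean over labeled units only, and an inverse-probability-weighted mean over labeled units.

```python
import math
import random


def simulate_selection_bias(n=20000, seed=0):
    rng = random.Random(seed)
    all_y = []
    labeled_y = []
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                # covariate
        y = 2.0 * x + rng.gauss(0.0, 1.0)      # primary outcome, true mean 0
        p = 1.0 / (1.0 + math.exp(-1.5 * x))   # labeling probability depends on x
        all_y.append(y)
        if rng.random() < p:                   # outcome observed only for labeled units
            labeled_y.append(y)
            weighted_sum += y / p
            weight_total += 1.0 / p
    return {
        "full_mean": sum(all_y) / n,
        "complete_case_mean": sum(labeled_y) / len(labeled_y),
        "ipw_mean": weighted_sum / weight_total,  # normalized (Hajek-style) weighting
    }


bias_demo = simulate_selection_bias()
for name, value in bias_demo.items():
    print(f"{name}: {value:.3f}")
```

Because units with larger `x` are both more likely to be labeled and have larger outcomes, the complete-case mean lands well above the true mean of 0, while reweighting by the labeling probability brings the estimate back near 0.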

Paper Structure (26 sections, 5 theorems, 91 equations, 7 figures, 1 algorithm)

Introduction
Setup and Notation
Data Structure
Estimand and Nuisance Functions
Identification
Doubly Robust Estimation
Doubly Robust Characterization
Estimation Procedure
Theoretical Results
Oracle Estimation Theory
Asymptotic Normality
Simulation Study
Discussion
Background on Efficiency Theory
Proofs
...and 11 more sections

Key Result

Theorem 1

Under Assumption asm:consistency--asm:surrogates-positivity we have for fixed $a \in \mathcal{A}$, where the expectations are over $Y, \mathbf{S}, \mathbf{V}$ in eq:identification.

Figures (7)

Figure 1: Example of a causal graph with surrogate outcome $\mathbf{S}$.
Figure 2: Root mean square error versus $\alpha$, where $n^{-\alpha}$ is the estimation error of the nuisance functions.
Figure 3: RMSE versus sample size (in log scale) when nuisance functions are estimated by parametric models.
Figure 4: RMSE versus sample size when nuisance functions are estimated by nonparametric models.
Figure 5: Root mean square error versus $\alpha$, where $n^{-\alpha}$ is the estimation error of the nuisance functions.
...and 2 more figures

Theorems & Definitions (10)

Theorem 1
Proposition 1
Theorem 2
Proposition 2
Theorem 3
proof
proof
proof
proof
proof
