Identification and estimation of the conditional average treatment effect with nonignorable missing covariates, treatment, and outcome

Shuozhi Zuo; Yixin Wang; Fan Yang

Abstract

Treatment effect heterogeneity is central to policy evaluation, social science, and precision medicine, where interventions can affect individuals differently. In observational studies, covariates, treatment, and outcomes are often only partially observed. When missingness depends on unobserved values (missing not at random; MNAR), standard methods can yield biased estimates of the conditional average treatment effect (CATE). This paper establishes nonparametric identification of the CATE under multivariate MNAR mechanisms that allow covariates, treatment, and outcomes to be MNAR. It also develops nonparametric and parametric estimators and proposes a sensitivity analysis framework for assessing robustness to violations of the missingness assumptions.
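The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's setup: the data-generating process, the logistic missingness model, and the helper `complete_case_cate` are all hypothetical choices made for illustration. It compares a complete-case estimate of $\tau_{1,0}(1)$ when outcomes are missing completely at random with the same estimate when the chance of observing $Y$ depends on $Y$ itself (MNAR).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process, chosen only for illustration.
x = rng.binomial(1, 0.5, size=n)                 # binary covariate
t = rng.binomial(1, 0.3 + 0.4 * x)               # treatment probability depends on X
y = 1.0 + 2.0 * t + x + t * x + rng.normal(size=n)
true_cate = 3.0                                  # tau_{1,0}(1) = 2 + 1 by construction


def complete_case_cate(observed, x_val=1):
    """Difference in mean outcomes among observed units with X = x_val."""
    treated = observed & (t == 1) & (x == x_val)
    control = observed & (t == 0) & (x == x_val)
    return y[treated].mean() - y[control].mean()


# MCAR: every outcome is observed with the same probability.
r_mcar = rng.random(n) < 0.6

# MNAR (assumed form): larger outcomes are less likely to be observed.
r_mnar = rng.random(n) < 1.0 / (1.0 + np.exp(-(2.0 - y)))

cate_mcar = complete_case_cate(r_mcar)
cate_mnar = complete_case_cate(r_mnar)
print(f"true: {true_cate:.2f}  MCAR: {cate_mcar:.2f}  MNAR: {cate_mnar:.2f}")
```

In this toy setup the MNAR complete-case estimate falls below the true value, because high outcomes are under-observed and the treated group, whose outcomes are larger, loses more of them. Only the outcome is made missing here; the paper's setting also allows covariates and treatment to be MNAR.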

Paper Structure (38 sections, 5 theorems, 74 equations, 7 figures, 1 table)

Introduction
Related work
Our contributions
Organization of the paper
Notation, definition, and causal assumptions
Missingness mechanisms and identification
Summary.
Estimation
Nonparametric series 2SLS estimator under Theorem 2 or 3
Parametric estimation under Theorem 2 or 3
Simulation study
Setups and estimators
Results
Application to the NJCS and sensitivity analysis
Data and discussion on assumptions
...and 23 more sections

Key Result

Theorem 1

Under the causal assumptions (i) to (iii) and Assumption 1, if $\mathbb{P}(R^Y=1,R^T=1,R^X=1\mid X=x,T=t)>0$ for all $x,t$, then $\mathbb{P}(Y\mid T,X)$ is identifiable, and therefore $\tau_{t_1,t_0}(x)$ is identifiable.
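
A short worked step may help explain the "therefore" in Theorem 1. This sketch assumes that $\tau_{t_1,t_0}(x)$ denotes $\mathbb{E}[Y(t_1)-Y(t_0)\mid X=x]$ and that causal assumptions (i) to (iii) are the usual consistency, unconfoundedness given $X$, and overlap conditions. Neither is spelled out in this summary, so both are assumptions here, and the potential-outcome notation $Y(t)$ is introduced only for illustration.

$$
\begin{aligned}
\tau_{t_1,t_0}(x)
&= \mathbb{E}[Y(t_1)\mid X=x] - \mathbb{E}[Y(t_0)\mid X=x] \\
&= \mathbb{E}[Y(t_1)\mid T=t_1, X=x] - \mathbb{E}[Y(t_0)\mid T=t_0, X=x] \\
&= \mathbb{E}[Y\mid T=t_1, X=x] - \mathbb{E}[Y\mid T=t_0, X=x].
\end{aligned}
$$

The second line uses unconfoundedness and overlap, and the third uses consistency. Both terms in the last line are functionals of $\mathbb{P}(Y\mid T,X)$, so identifying that conditional law identifies the CATE.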

Figures (7)

Figure 1: Each MNAR assumption restricts exactly one arrow into $R^Y$ relative to the general missingness DAG. Top row: general missingness, MCAR, and MAR. Bottom row: MNAR Assumptions 1--3.
Figure 2: CCA is unbiased only under Assumption 1; under Assumptions 2--3, bias can be substantial. Boxplots show percent bias in $\tau_{1,0}(1)$ (binary $X$) and $\tau_{1,0}(0)$ (continuous $X$) with binary $Y$; closer to zero is better. Rows correspond to $(X,T)$ type combinations and columns to Assumptions 1--3. Methods: Oracle, CCA, $X$-miss-indicator + CCA, MI (restricted), MI (all), NP, Para, and Para (full).
Figure 3: Under Assumptions 2--3, estimators that ignore outcome-dependent missingness can be biased. Boxplots show percent bias in $\tau_{1,0}(1)$ (binary $X$) and $\tau_{1,0}(0)$ (continuous $X$) with continuous $Y$; closer to zero is better. Rows correspond to $(X,T)$ type combinations and columns to Assumptions 1--3. Methods: Oracle, CCA, $X$-miss-indicator + CCA, MI (restricted), MI (all), NP, Para, and Para (full).
Figure 4: Credential attainment increases earnings for the reference profile: all 95% bootstrap percentile CIs for $\tau_{1,0}(x_\text{ref})$ exclude zero. Shown are CCA, $X$-miss-indicator + CCA, MI (restricted), MI (all), NP, and Para; NP and Para are reported under Assumptions 1--3. Larger values indicate larger earnings gains.
Figure 5: Sensitivity analysis adds one excluded edge into $R^Y$ to quantify departures from each baseline MNAR assumption. Each dotted red arrow indicates the violation introduced by offset $\delta$. At $\delta=0$, the baseline missingness model holds. In the Assumption 3 panel, $X^{\mathrm{id}}$ denotes race (the identifying covariate component); the remaining baseline covariates $X^c$, which may influence any other variables in the DAG, are omitted for readability.
...and 2 more figures

Theorems & Definitions (6)

Theorem 1
Theorem 2
Theorem 3
Remark 1
Proposition 1 (example for Theorem 2)
Proposition 2 (example for Theorem 3)
