Doubly robust estimation with functional outcomes missing at random

Xijia Liu, Kreske Felix Ecker, Lina Schelin, Xavier de Luna

Abstract

We present and study semi-parametric estimators for the mean of functional outcomes in situations where some of these outcomes are missing and covariate information is available on all units. Assuming that the missingness mechanism depends only on the covariates (the missing at random assumption), we present two estimators for the functional mean parameter, using working models for the functional outcome given the covariates, and the probability of missingness given the covariates. We contribute by establishing that both these estimators have Gaussian processes as limiting distributions and explicitly give their covariance functions. One of the estimators is doubly robust in the sense that the limiting distribution holds whenever at least one of the nuisance models is correctly specified. These results allow us to present simultaneous confidence bands for the mean function with asymptotically guaranteed coverage. A Monte Carlo study shows the finite sample properties of the proposed functional estimators and their associated simultaneous inference. The use of the method is illustrated in an application where the mean of counterfactual outcomes is targeted.
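To make the idea of combining an outcome model with a missingness model concrete, here is a minimal sketch of the textbook augmented inverse probability weighting (AIPW) form, applied pointwise to curves observed on a common grid. It is an illustration under stated assumptions, not the paper's estimator: the linear outcome regression, the logistic missingness model, the simulated data, and the names `fit_logistic` and `dr_functional_mean` are all hypothetical choices made for this example.

```python
import numpy as np


def fit_logistic(X, z, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (z - p)                          # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def dr_functional_mean(Y, z, X):
    """AIPW-style estimate of the mean function on a grid (illustrative only).

    Y : (n, T) curves on a common grid; rows with z == 0 are ignored (may be NaN).
    z : (n,) indicator, 1 if the curve is observed.
    X : (n, p) covariates, including an intercept column.
    """
    obs = z == 1
    # Working model 1: probability of being observed given the covariates.
    beta = fit_logistic(X, z.astype(float))
    pi_hat = 1.0 / (1.0 + np.exp(-X @ beta))
    # Working model 2: pointwise linear regression fitted on observed curves.
    coef, *_ = np.linalg.lstsq(X[obs], Y[obs], rcond=None)
    m_hat = X @ coef
    # Inverse-probability-weighted residual correction, zero for missing curves.
    resid = np.where(obs[:, None], Y - m_hat, 0.0)
    return (m_hat + resid / pi_hat[:, None]).mean(axis=0)


# Simulated example (assumed data-generating process, not from the paper).
rng = np.random.default_rng(0)
n, T = 2000, 25
t = np.linspace(0.0, 1.0, T)
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
true_mean = np.sin(2 * np.pi * t)
Y_full = true_mean[None, :] + x[:, None] * t[None, :] + rng.normal(scale=0.5, size=(n, T))
z = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + x))))   # missingness depends on x only
Y = np.where(z[:, None] == 1, Y_full, np.nan)

mu_dr = dr_functional_mean(Y, z, X)
mu_cc = Y[z == 1].mean(axis=0)   # naive complete-case mean, for comparison
```

In this setup both working models happen to be correctly specified, so the sketch does not by itself demonstrate double robustness; replacing one of the two models with a misspecified one (for example, dropping `x` from the outcome regression) is a simple way to explore that property. Simultaneous confidence bands would additionally require the covariance function of the limiting Gaussian process, which this excerpt does not reproduce.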

Paper Structure

This paper contains 15 sections, 6 theorems, 51 equations, 9 figures, 2 tables.

Key Result

Lemma 1

With some probability space $\left( \Omega, \mathcal{F}, \mathcal{P} \right)$, let $\left\{ X_n \right\} _{n \geq 1}$ be a sequence of independent random elements in Hilbert space $\left( \mathcal{H}, \mathcal{B}_{\mathcal{H}} \right)$ with mean $0$ and $\mathbb{E} \left( \left\| X_i \right\| ^2 \r

Figures (9)

  • Figure 1: Observed (solid grey curves) and "assumed missing" (dotted grey curves) outcomes for 30 generated individuals. Mean function of both observed and missing outcomes (dashed red line) and the naive mean estimate based on observed data (solid red line). Left: Data generated using Gaussian process errors; Right: Data generated using multivariate t-distributed errors.
  • Figure 2: Solid line: $\hat{\mu}_{DR}-\widehat{E}({\cal Y} \mid Z=1)$; and shaded grey: simultaneous 95% confidence bands.
  • Figure 3: Average SCB (solid) and PCB (dashed) from 1000 simulation replicates at n = 250 (black) and n = 3000 (green). Based on MVN errors for the OR model with no model misspecification.
  • Figure 4: Bias for the Monte Carlo simulations using Gaussian error terms for the OR estimator (top and bottom left panels) and DR estimator (middle and right panels); OR and PS models are outcome and propensity score models. By comparison, the bias of the complete case estimate varies between 0.21 and 1.39.
  • Figure 5: Mean estimated variances (solid lines) and Monte Carlo (MC) variance (dashed lines) for the Monte Carlo simulations using Gaussian error terms. Results for the OR estimator (top and bottom left panels) and DR estimator (middle and right panels); OR and PS models are outcome and propensity score models. By comparison, the results for the complete case estimate are at similar levels.
  • ...and 4 more figures

Theorems & Definitions (12)

  • Lemma 1
  • Lemma 2
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • ...and 2 more