Semiparametric Efficient Empirical Higher Order Influence Function Estimators

Lin Liu, Rajarshi Mukherjee, Whitney K. Newey, James M. Robins

TL;DR

This work introduces empirical higher order influence function (HOIF) estimators for semiparametric efficient estimation of functionals such as the mean of an outcome that is missing at random (MAR), removing the need to nonparametrically estimate the covariate density $g$. By replacing the population Gram matrix with its empirical counterpart, the estimator achieves $\sqrt{n}$-consistency and efficiency under minimal Hölder smoothness conditions, without requiring $g$ to be smooth. The approach extends to the full class of doubly robust functionals and adapts to unknown smoothness levels, yielding adaptive efficiency via basis choices with optimal approximation properties. Simulations show finite-sample gains when $g$ is rough, with reductions in bias and faster computation compared to density-based HOIFs. Overall, the paper fills a theoretical gap by delivering density-free, adaptive, and efficient HOIF estimators for a broad set of causal functionals under minimal assumptions.

Abstract

Robins et al. (2008, 2017) applied the theory of higher order influence functions (HOIFs) to derive an estimator of the mean $\psi$ of an outcome Y in a missing data model with Y missing at random conditional on a vector X of continuous covariates; their estimator, in contrast to previous estimators, is semiparametric efficient under the minimal conditions of Robins et al. (2009b), together with an additional (non-minimal) smoothness condition on the density g of X, because the Robins et al. (2008, 2017) estimator depends on a nonparametric estimate of g. In this paper, we introduce a new HOIF estimator that has the same asymptotic properties as the original one, but does not impose any smoothness requirement on g. This is important for two reasons. First, one rarely has knowledge of the properties of g. Second, even when g is smooth, if the dimension of X is even moderate, accurate nonparametric estimation of its density is not feasible at the sample sizes often encountered in applications. In fact, to the best of our knowledge, this new HOIF estimator remains the only semiparametric efficient estimator of $\psi$ under minimal conditions, despite the rapidly growing literature on causal effect estimation. We also show that our estimator can be generalized to the entire class of functionals considered by Robins et al. (2008), which includes the average effect of a treatment on a response Y when a vector X suffices to control confounding and the expected conditional variance of a response Y given a vector X. Simulation experiments are also conducted, which demonstrate that our new estimator outperforms those of Robins et al. (2008, 2017) in finite samples when g is not very smooth.
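
To make the Gram-matrix substitution described in the TL;DR concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-dimensional covariate, a cosine basis, and placeholder residuals `r1` and `r2` that stand in for the nuisance-function residuals of the actual estimator; the function names `cosine_basis`, `empirical_gram`, and `second_order_ustat` are hypothetical, and the sign and weighting conventions of the paper are not reproduced. It only shows that a second-order U-statistic with kernel $\bar{z}_{k}^{\top} (x') \widehat{\Omega}^{-1} \bar{z}_{k} (x)$ can be computed from a training-sample Gram matrix without ever estimating the density $g$.

```python
import numpy as np

def cosine_basis(x, k):
    """First k cosine basis functions on [0, 1] (orthonormal under Uniform(0, 1))."""
    x = np.asarray(x, dtype=float)
    j = np.arange(k)
    Z = np.sqrt(2.0) * np.cos(np.pi * j[None, :] * x[:, None])
    Z[:, 0] = 1.0  # constant function has unit norm without the sqrt(2) factor
    return Z

def empirical_gram(Z_train):
    """Empirical Gram matrix (1/n) * sum_i z(X_i) z(X_i)^T from a training sample."""
    return Z_train.T @ Z_train / Z_train.shape[0]

def second_order_ustat(r1, r2, Z, omega):
    """(1 / (n (n - 1))) * sum_{i != j} r1_i z(X_i)^T omega^{-1} z(X_j) r2_j."""
    n = Z.shape[0]
    u = Z.T @ r1                                 # sum_i r1_i z(X_i)
    v = Z.T @ r2                                 # sum_j r2_j z(X_j)
    full = u @ np.linalg.solve(omega, v)         # double sum including i == j
    omega_inv_Z = np.linalg.solve(omega, Z.T).T  # row i holds omega^{-1} z(X_i)
    diag = np.sum(r1 * r2 * np.einsum("ij,ij->i", Z, omega_inv_Z))
    return (full - diag) / (n * (n - 1))

# Illustrative data: a non-uniform covariate density, so the population Gram
# matrix is not the identity; it is never used or estimated via a density.
rng = np.random.default_rng(0)
k = 4
X_train = rng.beta(2.0, 5.0, size=2000)  # training sample, used only for omega_hat
X_est = rng.beta(2.0, 5.0, size=300)     # estimation sample
omega_hat = empirical_gram(cosine_basis(X_train, k))
Z_est = cosine_basis(X_est, k)
r1 = rng.normal(size=300)  # placeholder residuals (assumption, not the paper's)
r2 = rng.normal(size=300)
stat = second_order_ustat(r1, r2, Z_est, omega_hat)
print(stat)
```

The U-statistic is computed in time linear in the sample size by subtracting the diagonal terms from the full double sum, rather than forming the full $n \times n$ kernel matrix.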

Paper Structure

This paper contains 27 sections, 17 theorems, 115 equations, 3 figures, 10 tables.

Key Result

Proposition 1

For any invertible $\widehat{\Omega}$ one has conditional on the training sample, where with $K_{g, k} (x', x) = \bar{z}_{k}^{\top} (x') \Omega^{-1} \bar{z}_{k} (x)$ the orthogonal projection kernel onto $\bar{z}_{k} (x)$ in $L_{2} (g)$, $\mathsf{\Pi}_{g, \bar{z}_{k}} [h] (x) = \int \mathrm{d} x' g (x') h (x') K_{g, k} (x, x')$ the corresponding orthogonal projection of any function $x

Figures (3)

  • Figure 1: Results of simulation experiment. The upper panels compare $\widehat{\mathbb{IF}}_{2, 2, k} (\Omega)$, $\widehat{\mathbb{IF}}_{2, 2, k} (\widehat{\Omega}^{\rm emp})$, $\widehat{\mathbb{IF}}_{2, 2, k} (\widehat{\Omega}^{\rm ac})$ and $\widehat{\mathbb{IF}}_{2, 2, k} (\widehat{g})$. Color code: black: $\widehat{\psi}_{1} - \psi (\theta)$; grey: $\widehat{\mathbb{IF}}_{2, 2, k}(\Omega)$; blue: $\widehat{\mathbb{IF}}_{2, 2, k}(\widehat{\Omega}^{\rm emp})$; purple: $\widehat{\mathbb{IF}}_{2, 2, k} (\widehat{\Omega}^{\rm ac})$; green: $\widehat{\mathbb{IF}}_{2, 2, k} (\widehat{g})$. The lower panels compare the estimators before and after being corrected by different versions of $\widehat{\mathbb{IF}}_{2, 2, k}$, i.e. $\widehat{\psi}_{2, k}(\Omega) = \widehat{\psi}_{1} + \widehat{\mathbb{IF}}_{2, 2, k} (\Omega)$, $\widehat{\psi}_{2, k}^{\rm emp} = \widehat{\psi}_{1} + \widehat{\mathbb{IF}}_{2, 2, k} (\widehat{\Omega}^{\rm emp})$, $\widehat{\psi}_{2, k}^{\rm ac} = \widehat{\psi}_{1} + \widehat{\mathbb{IF}}_{2, 2, k} (\widehat{\Omega}^{\rm ac})$, and $\widehat{\psi}_{2, k}(\widehat{g}) = \widehat{\psi}_{1} + \widehat{\mathbb{IF}}_{2, 2, k} (\widehat{g})$. Color code: black: $\widehat{\psi}_{1} - \psi (\theta)$; grey: $\widehat{\psi}_{2, k}(\Omega) - \psi (\theta)$; blue: $\widehat{\psi}_{2, k}^{\rm emp} - \psi (\theta)$; purple: $\widehat{\psi}_{2, k}^{\rm ac} - \psi (\theta)$; green: $\widehat{\psi}_{2, k}(\widehat{g}) - \psi (\theta)$. In the panels on the left, the nuisance functions $b$ and $1 / p$ are estimated by GLMs, whereas in the panels on the right, they are estimated by GAMs. The dots in each plot are the Monte Carlo averages across 100 simulated datasets. The error bars in each plot correspond to the 10th and 90th percentiles of the 100 Monte Carlo simulations. Within each column of any panel, from left to right we display the simulation results for estimation sample sizes $n = 25000, 100000, 200000$.
  • Figure 2: Results of simulation experiment. The color codes are the same as in Figure 1, except that the simulations for $\widehat{\mathbb{IF}}_{2, 2, k} (\widehat{g})$ are removed from the upper panels and the simulations for $\widehat{\psi}_{1} - \psi (\theta)$ and $\widehat{\psi}_{2, k}(\widehat{g}) - \psi (\theta)$ are removed from the lower panels. Within each column of any panel, from left to right we display the simulation results for estimation sample sizes $n = 25000, 100000, 200000$.
  • Figure 3: Results of simulation experiment. The color codes are the same as in Figure 1, except that only $\widehat{\mathbb{IF}}_{2, 2, k} (\widehat{\Omega}^{\rm emp}) - \widehat{\mathbb{IF}}_{2, 2, k} (\Omega)$ and $\widehat{\mathbb{IF}}_{2, 2, k} (\widehat{\Omega}^{\rm ac}) - \widehat{\mathbb{IF}}_{2, 2, k} (\Omega)$ are displayed to highlight the observation that $\widehat{\mathbb{IF}}_{2, 2, k} (\widehat{\Omega}^{\rm emp})$ is closer to the oracle $\widehat{\mathbb{IF}}_{2, 2, k} (\Omega)$ than $\widehat{\mathbb{IF}}_{2, 2, k} (\widehat{\Omega}^{\rm ac})$. Within each column of any panel, from left to right we display the simulation results for estimation sample sizes $n = 25000, 100000, 200000$.

Theorems & Definitions (38)

  • Remark 1
  • Proposition 1
  • Remark 2
  • Proposition 2
  • Remark 3
  • Theorem 3
  • Corollary 4
  • Remark 4
  • Definition 1
  • Theorem 5
  • ...and 28 more