Double Cross-fit Doubly Robust Estimators: Beyond Series Regression
Alec McClean, Sivaraman Balakrishnan, Edward H. Kennedy, Larry Wasserman
TL;DR
The paper develops a theory for double cross-fit doubly robust (DCDR) estimators of the Expected Conditional Covariance (ECC) functional $\psi_{ecc} = \mathbb{E}\{\text{cov}(A,Y|X)\}$, providing a structure-agnostic error expansion and a spectral radius control that underpin fast rates under minimal assumptions. When the nuisance functions $\pi$ and $\mu$ are Hölder smooth (of orders $\alpha$ and $\beta$, with $d$ the covariate dimension), DCDR estimators built from linear smoothers such as local polynomial regression or k-NN are $\sqrt{n}$-consistent and asymptotically normal under mild conditions. In the non-$\sqrt{n}$ regime, the rates are $n^{-(\alpha+\beta)/d}$ or $n^{-(2\alpha+2\beta)/(2\alpha+2\beta+d)}$, depending on what is known about the covariate density. If the covariate density is known and smooth, covariate-density-adapted kernel regression is minimax rate-optimal, and an undersmoothed DCDR estimator can even satisfy a slower-than-$\sqrt{n}$ central limit theorem, enabling valid inference in regimes where standard $\sqrt{n}$ methods fail. Simulations corroborate the theory: they illustrate when double cross-fitting and undersmoothing substantially improve on the standard single cross-fit ("SCDR-MSE") approach, and they demonstrate asymptotic normality for undersmoothed DCDR in the non-$\sqrt{n}$ regime. The results offer practical guidance for constructing efficient, inference-ready estimators of causal functionals when the nuisance functions are smooth, and point to broader applicability to other mixed-bias functionals.
Abstract
Doubly robust estimators with cross-fitting have gained popularity in causal inference due to their favorable structure-agnostic error guarantees. However, when additional structure such as Hölder smoothness is available, more accurate "double cross-fit doubly robust" (DCDR) estimators can be constructed by splitting the training data and undersmoothing nuisance function estimators on independent samples. We study a DCDR estimator of the Expected Conditional Covariance, a functional of interest in causal inference and conditional independence testing. We first provide a structure-agnostic error analysis for the DCDR estimator with no assumptions on the nuisance functions or their estimators. Then, assuming the nuisance functions are Hölder smooth, but without assuming knowledge of the true smoothness level or the covariate density, we establish that DCDR estimators with several linear smoothers are $\sqrt{n}$-consistent and asymptotically normal under minimal conditions and achieve fast convergence rates in the non-$\sqrt{n}$ regime. When the covariate density and smoothnesses are known, we propose a minimax rate-optimal DCDR estimator based on undersmoothed kernel regression. Moreover, we show that an undersmoothed DCDR estimator satisfies a slower-than-$\sqrt{n}$ central limit theorem, and that inference is possible even in the non-$\sqrt{n}$ regime. Finally, we support our theoretical results with simulations, providing intuition for double cross-fitting and undersmoothing, demonstrating where our estimator achieves $\sqrt{n}$-consistency while the usual "single cross-fit" estimator fails, and illustrating asymptotic normality for the undersmoothed DCDR estimator.
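To make the double cross-fitting idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the nuisance functions are $\pi(X) = \mathbb{E}(A \mid X)$ and $\mu(X) = \mathbb{E}(Y \mid X)$, estimates each with k-NN regression on a separate training fold, and averages $(A - \hat{\pi}(X))(Y - \hat{\mu}(X))$ over a third, independent fold. The function names `knn_predict` and `dcdr_ecc`, the three-fold rotation, and the default `k=3` (a small value, standing in for undersmoothing) are all assumptions made for illustration.

```python
import numpy as np


def knn_predict(x_train, y_train, x_eval, k):
    """k-NN regression: average the responses of the k nearest training points."""
    # Pairwise squared Euclidean distances, shape (n_eval, n_train).
    d2 = ((x_eval[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k smallest distances in each row (order within the k is arbitrary).
    nearest = np.argpartition(d2, k - 1, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


def dcdr_ecc(x, a, y, k=3, seed=0):
    """Hypothetical double cross-fit estimate of E{cov(A, Y | X)}.

    x has shape (n, d); a and y have shape (n,). The data are split into
    three folds: one to fit pi_hat, one to fit mu_hat, one to evaluate.
    The roles are rotated and the three estimates averaged (an assumption
    of this sketch, chosen so that every observation is used for evaluation).
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), 3)
    estimates = []
    for r in range(3):
        f_pi, f_mu, f_est = folds[r], folds[(r + 1) % 3], folds[(r + 2) % 3]
        # Nuisance estimators are trained on independent samples.
        pi_hat = knn_predict(x[f_pi], a[f_pi], x[f_est], k)
        mu_hat = knn_predict(x[f_mu], y[f_mu], x[f_est], k)
        # Plug-in product of residuals on the held-out estimation fold.
        estimates.append(np.mean((a[f_est] - pi_hat) * (y[f_est] - mu_hat)))
    return float(np.mean(estimates))


if __name__ == "__main__":
    # Simulated example: the conditional covariance of A and Y given X is 0.5.
    rng = np.random.default_rng(1)
    n = 3000
    x = rng.uniform(size=(n, 1))
    eps = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=n)
    a = np.sin(2 * np.pi * x[:, 0]) + eps[:, 0]
    y = np.cos(2 * np.pi * x[:, 0]) + eps[:, 1]
    print(dcdr_ecc(x, a, y, k=3))
```

Because $\hat{\pi}$ and $\hat{\mu}$ are fit on independent samples, the bias of the product of residuals factors into the product of the two estimators' individual biases, which is why a small `k` (low bias, high variance) is a reasonable choice in this sketch. A single cross-fit variant would instead fit both nuisance estimators on the same training fold.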
