Neumann-series corrections for regression adjustment in randomized experiments

Dogyoon Song

Abstract

We study average treatment effect (ATE) estimation under complete randomization with many covariates in a design-based, finite-population framework. In randomized experiments, regression adjustment can use covariates to improve the precision of estimators without requiring a correctly specified outcome model. However, existing design-based analyses establish asymptotic normality only up to $p = o(n^{1/2})$, extendable to $p = o(n^{2/3})$ with a single de-biasing. We introduce a novel theoretical perspective on the asymptotic properties of regression adjustment through a Neumann-series decomposition, yielding systematic higher-degree corrections and a refined analysis of regression adjustment. Specifically, for ordinary least squares regression adjustment, the Neumann expansion sharpens the analysis of the remainder term relative to the residual difference-in-means. Under mild leverage regularity, we show that the degree-$d$ Neumann-corrected estimator is asymptotically normal whenever $p^{d+3}(\log p)^{d+1}=o(n^{d+2})$, strictly enlarging the admissible growth of $p$. The analysis is purely randomization-based and does not impose any parametric outcome models or super-population assumptions.
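To make the two quantitative ingredients of the abstract concrete, the following minimal sketch may help. It first computes the polynomial growth exponent implied by the stated condition when logarithmic factors are ignored: writing $p = n^{a}$, the condition $p^{d+3} = o(n^{d+2})$ holds exactly when $a < (d+2)/(d+3)$. It then illustrates the generic Neumann-series identity $(I - A)^{-1} = \sum_{k \ge 0} A^{k}$ by truncating it at a finite degree. The helper names and the toy $2 \times 2$ matrix are assumptions made for illustration; the operator the paper actually expands is not shown in this excerpt.

```python
from fractions import Fraction


def growth_exponent(d: int) -> Fraction:
    """Threshold a such that p = n**a meets p**(d+3) = o(n**(d+2)) iff a < threshold.

    Logarithmic factors in the stated condition are ignored here.
    """
    return Fraction(d + 2, d + 3)


def matmul(a, b):
    """Product of two small dense matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def neumann_partial_sum(a, degree):
    """Return I + a + a^2 + ... + a^degree, a truncated Neumann series."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    power = [row[:] for row in identity]
    for _ in range(degree):
        power = matmul(power, a)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix, used only as a reference value."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def truncation_error(a, degree):
    """Largest entrywise gap between the truncated series and (I - a)^{-1}."""
    i_minus_a = [[(1.0 if i == j else 0.0) - a[i][j] for j in range(2)] for i in range(2)]
    exact = inverse_2x2(i_minus_a)
    approx = neumann_partial_sum(a, degree)
    return max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))


# Hypothetical toy matrix with spectral radius below 1, so the series converges.
toy = [[0.2, 0.1], [0.0, 0.3]]

for d in range(4):
    # Exponents for d = 0, 1, 2, 3 are 2/3, 3/4, 4/5, 5/6.
    print(d, growth_exponent(d), truncation_error(toy, d))
```

Up to logarithmic factors, $d = 0$ gives the same polynomial exponent $2/3$ as the single de-biasing benchmark quoted above, and the threshold approaches $1$ as $d$ grows. The truncation error of the toy series shrinks with the degree, which is the generic mechanism behind higher-degree corrections.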

Paper Structure

This paper contains 78 sections, 17 theorems, 154 equations, 4 figures, 3 algorithms.

Key Result

Proposition 1

For every $d \in \mathbb{Z}_{\geq 0}$,

Figures (4)

  • Figure 1: Comparison of $\hat{\tau}_{\mathrm{OLS}}^{[d]}$ for $d \in \{0,1,2,3\}$ against $\hat{\tau}_{\mathrm{OLS}}$ and $\hat{\tau}_{\mathrm{DiM}}$ under Gaussian random $X$ and the worst‑case residual model \ref{item:worst} at $n = 500$, $n_1 = 150$, $N = 2000$. The shaded bands indicate $10\%$--$90\%$ regions across the $R=50$ outer repetitions. (Left): Normalized absolute bias tends to decrease as the correction degree $d$ increases. (Right): Normalized empirical variance increases slightly as $d$ grows, but the OLS-RA and Neumann-corrected estimators all exhibit smaller variance than the difference-in-means baseline, because regression adjustment removes the part of the outcomes explainable by covariates.
  • Figure 2: Comparison of $\hat{\tau}_{\mathrm{OLS}}^{[d]}$ for $d \in \{0,1,2,3\}$ against $\hat{\tau}_{\mathrm{OLS}}$ and $\hat{\tau}_{\mathrm{DiM}}$ under $t(2)$ random $X$ and the worst‑case residual model \ref{item:worst} at $n = 500$, $n_1 = 150$, $N = 2000$. The shaded bands indicate $10\%$--$90\%$ regions across the $R=50$ outer repetitions. (Left): Normalized absolute bias. (Right): Normalized empirical variance. We observe an alternating pattern: even‑degree corrections further reduce bias at the expense of higher variance (relative to $d-1$), while odd‑degree corrections tend to lower variance but can increase absolute bias.
  • Figure 3: Comparison of $\hat{\tau}_{\mathrm{OLS}}^{[d]}$ for $d \in \{0,1,2,3\}$ against $\hat{\tau}_{\mathrm{OLS}}$ and $\hat{\tau}_{\mathrm{DiM}}$ under $t(2)$ random $X$ with each covariate trimmed at its $5\%$ and $95\%$ quantiles, and the worst‑case residual model \ref{item:worst} at $n = 500$, $n_1 = 150$, $N = 2000$. The shaded bands indicate $10\%$--$90\%$ regions across the $R=25$ outer repetitions. (Left): Normalized absolute bias. (Right): Normalized empirical variance. With covariate trimming, Neumann‑corrected estimators reduce bias more effectively even under heavy‑tailed covariates, with little increase in variance.
  • Figure 4: Comparison of $\hat{\tau}_{\mathrm{OLS}}$ versus $\hat{\tau}_{\mathrm{OLS}}^{[d]}$ for $d \in \{0,1,2,3\}$ under the "typical i.i.d. residuals" (cf. Section \ref{sec:experiments}) at $n = 500$, $n_1 = 150$, $N = 250$, $R=25$ (median with $10\%$--$90\%$ region shaded). (Left): Normalized absolute bias. (Middle): Normalized empirical variance. (Right): Normalized RMSE. In this setting, corrections provide only modest improvements because the baseline bias is already small.
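
The comparison between difference-in-means and regression adjustment that these figures report can be sketched in a few lines. The block below is only a generic illustration under stated assumptions, not the paper's experimental setup: it uses a single covariate, arm-wise OLS fits evaluated at the full-sample covariate mean (one common form of regression adjustment), and a hypothetical linear outcome model with a constant unit-level effect of 1. Only $n = 500$ and $n_1 = 150$ are taken from the captions; the worst-case residual model and the Neumann-corrected estimators are not defined in this excerpt and are not reproduced here.

```python
import random
import statistics


def ols_line(x, y):
    """Intercept and slope of the least-squares line y ~ a + b * x."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def complete_randomization(n, n1, rng):
    """Treatment indicators with exactly n1 ones, chosen uniformly at random."""
    treated = set(rng.sample(range(n), n1))
    return [1 if i in treated else 0 for i in range(n)]


def diff_in_means(y_obs, z):
    """Mean observed outcome among treated minus mean among controls."""
    treated = [y for y, zi in zip(y_obs, z) if zi == 1]
    control = [y for y, zi in zip(y_obs, z) if zi == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def regression_adjusted(y_obs, x, z):
    """Arm-wise OLS predictions at the full-sample covariate mean, differenced."""
    xbar = statistics.fmean(x)
    fits = {}
    for arm in (0, 1):
        xa = [xi for xi, zi in zip(x, z) if zi == arm]
        ya = [yi for yi, zi in zip(y_obs, z) if zi == arm]
        intercept, slope = ols_line(xa, ya)
        fits[arm] = intercept + slope * xbar
    return fits[1] - fits[0]


# Hypothetical fixed finite population; only n and n1 come from the captions.
rng = random.Random(0)
n, n1 = 500, 150
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y0 = [2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]  # control potential outcomes
y1 = [y + 1.0 for y in y0]                          # constant effect, so ATE = 1

dim_estimates, adj_estimates = [], []
for _ in range(200):  # randomness comes only from the treatment assignment
    z = complete_randomization(n, n1, rng)
    y_obs = [y1[i] if z[i] == 1 else y0[i] for i in range(n)]
    dim_estimates.append(diff_in_means(y_obs, z))
    adj_estimates.append(regression_adjusted(y_obs, x, z))

var_dim = statistics.pvariance(dim_estimates)
var_adj = statistics.pvariance(adj_estimates)
print("variance ratio (adjusted / DiM):", var_adj / var_dim)
```

The potential outcomes are held fixed and only the assignment is redrawn, which mirrors the design-based viewpoint of the abstract. In this toy population the covariate explains most of the outcome variation, so the adjusted estimator varies far less across randomizations than the difference-in-means.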

Theorems & Definitions (67)

  • Definition 3.1
  • Proposition 1
  • Proof of Proposition \ref{prop:neumann_exp}
  • Example 1: Degree $d=0$: closed form
  • Example 2: Degree $d=0$: sample analog estimator
  • Definition 3.2
  • Definition 4.1: Word
  • Definition 4.2: Position graph
  • Definition 4.3: Partition of a set
  • Definition 4.4: Allocation and design monomial
  • ...and 57 more