Sharp variance estimator and causal bootstrap in stratified randomized experiments

Haoyang Yu, Ke Zhu, Hanzhong Liu

TL;DR

This work advances inference in stratified randomized experiments by introducing a sharp, consistent variance estimator and two design-based causal bootstrap methods. The rank-preserving imputation bootstrap offers second-order refinement over normal approximation in strata with sufficient sizes, while the constant-treatment-effect imputation extends to paired designs. Through theory (consistency, Edgeworth expansions) and extensive simulations plus real-data applications, the authors show improved finite-sample coverage and shorter confidence intervals, particularly under nonnormal or heavy-tailed outcomes. An accompanying R package, CausalBootstrap, enables practical implementation. These methods enhance reliable, model-lean causal inference in complex randomization schemes.

Abstract

Randomized experiments are the gold standard for estimating treatment effects, and randomization serves as a reasoned basis for inference. In widely used stratified randomized experiments, randomization-based finite-population asymptotic theory enables valid inference for the average treatment effect, relying on normal approximation and a Neyman-type conservative variance estimator. However, when the sample size is small or the outcomes are skewed, the Neyman-type variance estimator may become overly conservative, and the normal approximation can fail. To address these issues, we propose a sharp variance estimator and two causal bootstrap methods to more accurately approximate the sampling distribution of the weighted difference-in-means estimator in stratified randomized experiments. The first causal bootstrap procedure is based on rank-preserving imputation and we prove its second-order refinement over normal approximation. The second causal bootstrap procedure is based on constant-treatment-effect imputation and is further applicable in paired experiments. In contrast to traditional bootstrap methods, where randomness originates from hypothetical super-population sampling, our analysis for the proposed causal bootstrap is randomization-based, relying solely on the randomness of treatment assignment in randomized experiments. Numerical studies and two real data applications demonstrate advantages of our proposed methods in finite samples. The R package CausalBootstrap implementing our method is publicly available.
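To make the idea of a randomization-based bootstrap with constant-treatment-effect imputation concrete, the following is a minimal Python sketch, not the authors' procedure and not the API of the CausalBootstrap package. It assumes that missing potential outcomes are imputed by shifting observed outcomes by the overall point estimate, that treatment is re-randomized within each stratum with the original treated counts, and that an unstudentized basic interval is formed from the bootstrap deviations. The function names `weighted_dim` and `constant_effect_bootstrap` are hypothetical, and the paper's actual statistic and imputation details may differ.

```python
import random
from statistics import mean


def weighted_dim(strata):
    """Weighted difference-in-means; strata is a list of (treated, control) outcome lists."""
    n = sum(len(t) + len(c) for t, c in strata)
    return sum((len(t) + len(c)) / n * (mean(t) - mean(c)) for t, c in strata)


def constant_effect_bootstrap(strata, n_boot=2000, level=0.95, seed=0):
    """Illustrative causal bootstrap under an assumed constant-effect imputation."""
    rng = random.Random(seed)
    tau_hat = weighted_dim(strata)

    # Assumption: impute Y(0) = Y - tau_hat for treated units, so every unit
    # in the imputed table has treatment effect exactly tau_hat.
    imputed = []
    for treated, control in strata:
        y0 = [y - tau_hat for y in treated] + list(control)
        imputed.append((y0, len(treated)))

    deviations = []
    for _ in range(n_boot):
        draw = []
        for y0, n1 in imputed:
            # Re-randomize within the stratum, keeping the treated count fixed.
            idx = set(rng.sample(range(len(y0)), n1))
            t = [y + tau_hat for i, y in enumerate(y0) if i in idx]
            c = [y for i, y in enumerate(y0) if i not in idx]
            draw.append((t, c))
        deviations.append(weighted_dim(draw) - tau_hat)

    # Basic (unstudentized) interval from the empirical quantiles of the deviations.
    deviations.sort()
    alpha = 1 - level
    lo_q = deviations[int((alpha / 2) * (n_boot - 1))]
    hi_q = deviations[int((1 - alpha / 2) * (n_boot - 1))]
    return tau_hat, (tau_hat - hi_q, tau_hat - lo_q)


strata = [([3.0, 5.0], [1.0, 2.0]), ([4.0, 6.0, 8.0], [2.0, 3.0, 4.0])]
tau_hat, (ci_lo, ci_hi) = constant_effect_bootstrap(strata)

# The same sketch runs on a paired design (one treated, one control per stratum).
pairs = [([5.0], [3.0]), ([4.0], [1.0]), ([6.0], [5.0])]
tau_pair, (pair_lo, pair_hi) = constant_effect_bootstrap(pairs)
```

Because this sketch needs no within-arm variance, it also runs when each stratum holds a single treated and a single control unit, which is one way to read the abstract's remark that the constant-treatment-effect approach applies to paired experiments.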

Paper Structure (12 sections, 8 theorems, 10 equations, 2 figures, 4 tables)

Key Result

Proposition 1

If Conditions 1--3 hold, the standardized statistic $\sqrt n(\hat{\tau}-\tau)/\sigma$ converges in distribution to a standard normal distribution, denoted as $\mathcal{N}(0,1)$. Furthermore, if $2 \leq n_{[m] 1} \leq n_{[m] }-2$ for $m=1,\dots,M$, the Neyman-type variance estimator $\hat{\sigma}$ ...
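As a rough illustration of the normal-approximation baseline that Proposition 1 refers to, the sketch below computes the weighted difference-in-means estimator together with a textbook Neyman-type variance estimator and the resulting Wald interval. It assumes stratum weights $n_{[m]}/n$ and the usual form $\sum_m (n_{[m]}/n)^2 (s^2_{[m]1}/n_{[m]1} + s^2_{[m]0}/n_{[m]0})$ for the estimated variance of $\hat{\tau}$. The paper's exact normalization (for example, scaling by $n$) is not recoverable from the truncated statement above, and the function names are hypothetical.

```python
from statistics import NormalDist, mean, variance


def stratified_dim(strata):
    """Return (tau_hat, var_hat); strata is a list of (treated, control) outcome lists."""
    n = sum(len(t) + len(c) for t, c in strata)
    tau_hat = 0.0
    var_hat = 0.0
    for treated, control in strata:
        n1, n0 = len(treated), len(control)
        # Sample variances need at least two units per arm, which mirrors
        # the requirement 2 <= n_[m]1 <= n_[m] - 2 in the proposition.
        if n1 < 2 or n0 < 2:
            raise ValueError("each stratum needs at least two units per arm")
        w = (n1 + n0) / n
        tau_hat += w * (mean(treated) - mean(control))
        var_hat += w ** 2 * (variance(treated) / n1 + variance(control) / n0)
    return tau_hat, var_hat


def normal_ci(tau_hat, var_hat, level=0.95):
    """Wald interval based on the normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * var_hat ** 0.5
    return tau_hat - half, tau_hat + half


strata = [([3.0, 5.0], [1.0, 2.0]), ([4.0, 6.0, 8.0], [2.0, 3.0, 4.0])]
tau_hat, var_hat = stratified_dim(strata)
ci = normal_ci(tau_hat, var_hat)
```

The size check makes explicit why this estimator is undefined for paired designs: with one unit per arm there is no within-arm sample variance to compute.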

Figures (2)

  • Figure 1: Density plot and Q-Q plot for the outcomes from the clinical trial for cannabis cessation.
  • Figure 2: Density plot and Q-Q plot for the outcomes from the public health field experiment in Mexico.

Theorems & Definitions (10)

  • Proposition 1
  • Definition 1: Co-monotonicity
  • Proposition 2
  • Theorem 1: Consistency of sharp variance estimator
  • Remark 1
  • Theorem 2: Bootstrap CLT
  • Theorem 3: Edgeworth expansion
  • Theorem 4: Second-order refinement
  • Proposition 3
  • Theorem 5: Bootstrap CLT