Finely Stratified Rerandomization Designs
Max Cytrynbaum
TL;DR
The paper develops a theory of finely stratified rerandomization, a design that first matches units into small groups on baseline covariates (e.g. matched pairs) and then rerandomizes treatment within groups until a balance criterion is satisfied. It shows that such designs implement partially linear regression adjustment by design, yielding nonparametric control over the stratification covariates and linear control over the rerandomization covariates, and derives asymptotics for GMM estimators of both finite-population and superpopulation estimands. It introduces nonlinear rerandomization variants, in which imbalance is measured with nonlinear estimators, and proves their asymptotic equivalence to linear designs. It then chooses the acceptance region via a minimax criterion that trades off the computational cost of rerandomization against a bound on estimation error, potentially leveraging pilot data. The paper also provides a framework for inference under stratified rerandomization, including variance bounds that allow conservative inference on finite-population parameters and ex-post linear adjustments that restore asymptotic normality, with simulations and an Angrist 2013 application illustrating gains in estimating treatment-effect heterogeneity. Overall, the work expands the toolkit for causal inference under data-adaptive stratification, offering practical guidance on design choices and inference procedures.
Abstract
We study estimation and inference on causal parameters under finely stratified rerandomization designs, which use baseline covariates to match units into groups (e.g. matched pairs), then rerandomize within-group treatment assignments until a balance criterion is satisfied. We show that finely stratified rerandomization does partially linear regression adjustment by design, providing nonparametric control over the stratified covariates and linear control over the rerandomized covariates. We introduce several new forms of rerandomization, allowing for imbalance metrics based on nonlinear estimators, and proposing a minimax scheme that minimizes the computational cost of rerandomization subject to a bound on estimation error. While the asymptotic distribution of GMM estimators under stratified rerandomization is generically non-normal, we show how to restore asymptotic normality using ex-post linear adjustment tailored to the stratification. We derive new variance bounds that enable conservative inference on finite population causal parameters, and provide asymptotically exact inference on their superpopulation counterparts.
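To make the design concrete, the following is a minimal Python sketch of matched-pairs rerandomization as described above. It is an illustration under stated assumptions, not the paper's implementation: the function names (`matched_pairs`, `stratified_rerandomize`), the scalar stratification covariate `x`, the matching rule (sorting on `x` and pairing adjacent units), and the acceptance rule (Euclidean norm of the scaled difference in means of `w` below a hypothetical `threshold`) are all choices made here for simplicity.

```python
import numpy as np

def matched_pairs(x):
    """Pair units by sorting on a scalar covariate and grouping neighbors.

    Assumption for illustration: x is one-dimensional and n is even.
    Returns an (n/2, 2) array of unit indices, one row per pair.
    """
    order = np.argsort(x)
    return order.reshape(-1, 2)

def stratified_rerandomize(x, w, threshold, rng, max_draws=10_000):
    """Redraw within-pair assignments until the balance check passes.

    Hypothetical acceptance rule: sqrt(n) * ||mean(w | treated) -
    mean(w | control)|| <= threshold. Returns the accepted assignment
    vector and the number of draws used.
    """
    pairs = matched_pairs(x)
    n, m = len(x), len(pairs)
    for draw in range(1, max_draws + 1):
        # One fair coin flip per pair picks which member is treated.
        flip = rng.integers(0, 2, size=m)
        d = np.zeros(n, dtype=int)
        d[pairs[np.arange(m), flip]] = 1
        diff = w[d == 1].mean(axis=0) - w[d == 0].mean(axis=0)
        if np.sqrt(n) * np.linalg.norm(diff) <= threshold:
            return d, draw
    raise RuntimeError("no acceptable assignment within max_draws")

# Example with simulated covariates (synthetic data, for illustration only).
rng = np.random.default_rng(0)
n = 40
x = rng.normal(size=n)        # covariate used for matching
w = rng.normal(size=(n, 2))   # covariates used for rerandomization
threshold = 1.0
pairs = matched_pairs(x)
d, n_draws = stratified_rerandomize(x, w, threshold, rng)
```

A tighter `threshold` enforces better balance on `w` but requires more draws on average, which is the computational cost that the minimax scheme in the abstract trades off against estimation error.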
