Fast Rerandomization for Balancing Covariates in Randomized Experiments: A Metropolis-Hastings Framework
Jiuyao Lu, Tianruo Zhang, Ke Zhu
TL;DR
This work tackles covariate imbalance in randomized experiments, where rerandomization based on the Mahalanobis distance $M(\mathbf{W})$ can achieve strong balance but relies on rejection sampling that becomes prohibitively slow in high dimensions. It introduces a Metropolis-Hastings framework with pair-switching proposals whose stationary distribution is $\pi(\mathbf{W}) \propto M(\mathbf{W})^{-1/T}$, and then applies a sampling-importance resampling step to recover uniform draws over the acceptable set $\mathcal{W}_a$, preserving classical inference guarantees. The resulting PSRSRR algorithm speeds up sampling by orders of magnitude while maintaining exact and asymptotic validity, as demonstrated through extensive simulations and two real-data applications (STAR and reserpine). The approach provides a practical, theory-grounded path to strict rerandomization thresholds, with potential extensions to complex experimental designs and covariate-prioritized balance criteria. Because the draws remain uniform over $\mathcal{W}_a$, Fisher randomization tests and efficient estimation carry over unchanged to modern causal inference pipelines.
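The two stages described above can be sketched in a few lines. This is a minimal illustration rather than the authors' PSRSRR implementation: the function names (`make_balance_fn`, `pair_switch_chain`, `sir_uniform_draw`) are hypothetical, the Mahalanobis distance uses the common scaling $n_1 n_0 / n$ as an assumption, and the resampling step simply weights each visited acceptable assignment by $1/\pi(\mathbf{W}) \propto M(\mathbf{W})^{1/T}$.

```python
import numpy as np

def make_balance_fn(X):
    """Return M(w): Mahalanobis distance between arm means (assumed scaling n1*n0/n)."""
    S_inv = np.linalg.inv(np.atleast_2d(np.cov(X, rowvar=False)))
    n = X.shape[0]

    def M(w):
        n1 = w.sum()
        diff = X[w == 1].mean(axis=0) - X[w == 0].mean(axis=0)
        return (n1 * (n - n1) / n) * diff @ S_inv @ diff

    return M

def pair_switch_chain(X, n_treated, T, n_steps, rng):
    """Metropolis-Hastings chain with stationary law proportional to M(w)^(-1/T)."""
    M = make_balance_fn(X)
    n = X.shape[0]
    w = np.zeros(n, dtype=int)
    w[rng.choice(n, size=n_treated, replace=False)] = 1
    m = M(w)
    states, dists = [], []
    for _ in range(n_steps):
        # Symmetric proposal: swap one treated unit with one control unit.
        i = rng.choice(np.flatnonzero(w == 1))
        j = rng.choice(np.flatnonzero(w == 0))
        prop = w.copy()
        prop[i], prop[j] = 0, 1
        m_prop = M(prop)
        # Accept with probability min(1, (m / m_prop)^(1/T)).
        if m_prop <= m or rng.random() < (m / m_prop) ** (1.0 / T):
            w, m = prop, m_prop
        states.append(w.copy())
        dists.append(m)
    return np.array(states), np.array(dists)

def sir_uniform_draw(states, dists, threshold, T, rng):
    """Resample one acceptable state with importance weight 1/pi, i.e. M^(1/T)."""
    ok = dists <= threshold
    if not ok.any():
        raise ValueError("the chain visited no acceptable assignment")
    weights = dists[ok] ** (1.0 / T)
    total = weights.sum()
    # Fall back to equal weights in the degenerate case where every M is zero.
    probs = weights / total if total > 0 else np.full(ok.sum(), 1.0 / ok.sum())
    idx = rng.choice(int(ok.sum()), p=probs)
    return states[ok][idx]
```

Smaller values of `T` concentrate the chain on well-balanced assignments, which is why the importance weights are needed to undo that tilt before an assignment is returned. The resampled draw is only approximately uniform for a finite chain.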
Abstract
Balancing covariates is critical for credible and efficient randomized experiments. Rerandomization addresses this by repeatedly generating treatment assignments until covariate balance meets a prespecified threshold. By shrinking this threshold, it can achieve arbitrarily strong balance, with established results guaranteeing optimal estimation and valid inference, both in finite samples and asymptotically, across a wide range of complex experimental designs. Despite these rigorous theoretical foundations, practical use is limited by the extreme inefficiency of rejection sampling, which becomes prohibitively slow under small thresholds and often forces practitioners to adopt suboptimal thresholds, degrading performance. Existing acceleration methods typically fail to maintain uniformity over the acceptable assignment space, and thus lose the theoretical guarantees of classical rerandomization. Building upon a Metropolis-Hastings framework, we address this challenge by introducing an additional sampling-importance resampling step, which restores uniformity and preserves statistical guarantees. Our proposed algorithm, PSRSRR, achieves speedups ranging from 10 to 10,000 times while maintaining exact and asymptotic validity, as demonstrated by simulations and two real-data applications.
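For reference, classical rerandomization by rejection sampling can be sketched as follows. This is an illustrative baseline under stated assumptions, not code from the paper: the helper names `mahalanobis_balance` and `rerandomize_rejection` are hypothetical, and the balance measure is the Mahalanobis distance between arm means with the common $n_1 n_0 / n$ scaling. The returned draw count makes the cost of a strict threshold visible.

```python
import numpy as np

def mahalanobis_balance(X, w):
    """Mahalanobis distance between treated and control covariate means."""
    n = len(w)
    n1 = int(w.sum())
    diff = X[w == 1].mean(axis=0) - X[w == 0].mean(axis=0)
    S = np.atleast_2d(np.cov(X, rowvar=False))
    return (n1 * (n - n1) / n) * diff @ np.linalg.solve(S, diff)

def rerandomize_rejection(X, n_treated, threshold, rng, max_draws=100_000):
    """Draw complete randomizations until the balance criterion is met."""
    base = np.zeros(X.shape[0], dtype=int)
    base[:n_treated] = 1
    for draws in range(1, max_draws + 1):
        w = rng.permutation(base)  # uniform over assignments with n_treated ones
        if mahalanobis_balance(X, w) <= threshold:
            return w, draws
    raise RuntimeError("no acceptable assignment found within max_draws")
```

Every accepted assignment is uniform over the acceptable set by construction, but the expected number of draws is the reciprocal of the acceptance probability, which shrinks quickly as `threshold` decreases.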
