Adaptive Neyman Allocation

Jinglong Zhao

TL;DR

This work develops a competitive-analysis framework for adaptive Neyman allocation in multi-stage experiments with unknown variances. It introduces simple batched strategies (two-stage and multi-stage) that estimate the standard deviations of the treated and control groups from earlier stages and allocate later units accordingly, achieving near-optimal second-order efficiency as the number of stages grows. Theoretical results include high-probability and in-expectation competitive guarantees and an information-theoretic lower bound, along with valid estimation and inference for adaptively collected data under stability conditions. Empirical validation on online A/B testing data and synthetic simulations demonstrates meaningful variance reductions and reliable inference, providing practical guidance for designing efficient multi-stage experiments in which treated and control outcomes have different variances. The approach offers a principled way to allocate experimental units across treated and control groups when the variances differ and only data observed in earlier stages can inform later allocations.

Abstract

In the experimental design literature, Neyman allocation refers to the practice of allocating units into treated and control groups, potentially in unequal numbers proportional to their respective standard deviations, with the objective of minimizing the variance of the treatment effect estimator. This widely recognized approach increases statistical power in scenarios where the treated and control groups have different standard deviations, as is often the case in social experiments, clinical trials, marketing research, and online A/B testing. However, Neyman allocation cannot be implemented unless the standard deviations are known in advance. Fortunately, the multi-stage nature of the aforementioned applications allows the use of earlier stage observations to estimate the standard deviations, which further guide allocation decisions in later stages. In this paper, we introduce a competitive analysis framework to study this multi-stage experimental design problem. We propose a simple adaptive Neyman allocation algorithm, which almost matches the information-theoretic limit of conducting experiments. We provide theory for estimation and inference using data collected from our adaptive Neyman allocation algorithm. We demonstrate the effectiveness of our adaptive Neyman allocation algorithm using both online A/B testing data from a social media site and synthetic data.
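To make the allocation rule concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes independent arms and a difference-in-means estimator, so the variance is $\sigma(1)^2/T(1) + \sigma(0)^2/T(0)$. The function names, the pilot size `pilot_per_arm`, and the clamping rule are illustrative assumptions; the paper's actual stage sizes are not stated in this summary.

```python
import random
import statistics


def neyman_allocation(total_units, sd_treated, sd_control):
    """Split total_units between arms in proportion to their standard deviations."""
    if sd_treated + sd_control == 0:
        n_treated = total_units // 2  # no variance information: split evenly
    else:
        n_treated = round(total_units * sd_treated / (sd_treated + sd_control))
    # Keep at least one unit in each arm so both group means can be estimated.
    n_treated = min(max(n_treated, 1), total_units - 1)
    return n_treated, total_units - n_treated


def estimator_variance(n_treated, n_control, sd_treated, sd_control):
    """Variance of the difference-in-means estimator with independent arms."""
    return sd_treated ** 2 / n_treated + sd_control ** 2 / n_control


def two_stage_allocation(total_units, pilot_per_arm, draw_treated, draw_control):
    """Hypothetical two-stage plug-in rule; the pilot size is an assumption."""
    if pilot_per_arm < 2 or 2 * pilot_per_arm > total_units:
        raise ValueError("need 2 <= pilot_per_arm <= total_units / 2")
    # Stage 1: equal-sized pilot in each arm to estimate the standard deviations.
    sd_treated = statistics.stdev([draw_treated() for _ in range(pilot_per_arm)])
    sd_control = statistics.stdev([draw_control() for _ in range(pilot_per_arm)])
    # Stage 2: aim for the Neyman split implied by the estimated deviations.
    target_treated, _ = neyman_allocation(total_units, sd_treated, sd_control)
    # Pilot units are already assigned, so neither arm can drop below the pilot size.
    n_treated = min(max(target_treated, pilot_per_arm), total_units - pilot_per_arm)
    return n_treated, total_units - n_treated
```

With known standard deviations of 5 and 1 and 1,000 units, `neyman_allocation(1000, 5.0, 1.0)` assigns most units to the noisier arm, and `estimator_variance` can be used to compare that split against an even 500/500 split.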

Paper Structure (61 sections, 36 theorems, 604 equations, 13 figures, 5 tables, 3 algorithms)

Introduction
Related Literature
Problem Setup
An Optimization Framework
Two-Stage Adaptive Neyman Allocation
Multi-Stage Adaptive Neyman Allocation
Post-Experiment Analysis Using Adaptively Collected Data
Extensions
Simulations Using Online A/B Testing Data
Simulations Using Synthetic Data
Conclusions
Intuitions Behind Algorithm Design
Further Extensions
Useful Lemmas
Martingale Central Limit Theorem
...and 46 more sections

Key Result

Theorem 1

The optimal solution to the optimization problem is given by $T(1) = T(0) = T / 2$. The supremum of the inner optimization problem is achieved when either the treated group or the control group has zero variance, that is, $\sigma(1) = 0$ or $\sigma(0) = 0$.
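A short worked calculation illustrates why an even split is the natural answer. The objective is not reproduced in this summary, so this sketch assumes it is the ratio of the difference-in-means variance under an allocation $(T(1), T(0))$ with $T(1) + T(0) = T$ to the variance under Neyman allocation with known standard deviations:

$$
\frac{\sigma(1)^2 / T(1) + \sigma(0)^2 / T(0)}{\left(\sigma(1) + \sigma(0)\right)^2 / T}.
$$

Writing $s = \sigma(1) / \left(\sigma(1) + \sigma(0)\right) \in [0, 1]$, the ratio becomes $T \left( s^2 / T(1) + (1 - s)^2 / T(0) \right)$, which is convex in $s$ and therefore maximized at an endpoint, $s = 1$ (that is, $\sigma(0) = 0$) or $s = 0$ (that is, $\sigma(1) = 0$). The worst case is

$$
\max\left\{ \frac{T}{T(1)}, \frac{T}{T(0)} \right\} \ge 2,
$$

with equality exactly when $T(1) = T(0) = T / 2$. Under the even split the ratio equals

$$
\frac{2\left(\sigma(1)^2 + \sigma(0)^2\right)}{\left(\sigma(1) + \sigma(0)\right)^2} = 2 - \frac{4\,\sigma(1)\,\sigma(0)}{\left(\sigma(1) + \sigma(0)\right)^2},
$$

which reaches its maximum of 2 only when one of the two standard deviations is zero.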

Figures (13)

Figure 1: Distributions of the number of clicks per million impressions at a social media site [AB_testing_kaggle]
Figure 2: Competitive ratios with respect to different numbers of stages
Figure 3: Simulated variances of experiments under different numbers of stages
Figure 4: Simulated distributions of experiments under different numbers of stages
Figure 5: Normalized mean squared error with respect to sample size when $\sigma(1) / \sigma(0)=5$
...and 8 more figures

Theorems & Definitions (37)

Theorem 1
Theorem 2
Theorem 3
Theorem 4
Example 1: Symmetric Distribution Implies No Conditioning Bias
Theorem 5: Finite Sample Unbiasedness
Theorem 6: Asymptotic Normality
Proposition 1: Sample Variance Estimator Consistency
Corollary 1
Corollary 2
...and 27 more
