Evaluating A/B Testing Methodologies via Sample Splitting: Theory and Practice

Ryan Kessler; James McQueen; Miikka Rokkanen

Abstract

We develop a theoretical framework for sample splitting in A/B testing environments, where data for each test are partitioned into two splits to measure methodological performance when the true impacts of tests are unobserved. We show that sample-split estimators are generally biased for full-sample performance but consistently estimate sample-split analogues of it. We derive their asymptotic distributions, construct valid confidence intervals, and characterize the bias-variance trade-offs underlying sample-split design choices. We validate our theoretical results through simulations and provide implementation guidance for A/B testing products seeking to evaluate new estimators and decision rules.
