Breaking the Winner's Curse with Bayesian Hybrid Shrinkage

Richard Mudd, Abbas Zaidi, Rina Friedberg, Ilya Gorbachev, Anchal Choubey, Houssam Nassif

Abstract

The widespread adoption of randomized controlled trials (A/B Tests) for decision-making has introduced a pervasive "Winner's Curse": experiments selected for launch often exhibit upwardly biased effect estimates and invalid confidence intervals. This selection bias leads to over-optimistic impact projections and undermines decision-making, particularly in low-power regimes. We propose Bayesian Hybrid Shrinkage (BHS), an empirical Bayes (EB) framework that leverages data-driven priors to mitigate selection bias and provides accurate uncertainty quantification. Unlike traditional EB methods that apply uniform shrinkage, BHS introduces an experiment-specific "local" shrinkage factor that incorporates individual experiment characteristics, improving robustness against prior misspecification. We also derive a closed-form inference strategy designed for high-throughput production environments. Extensive simulations and real-world evaluations at Meta Platforms demonstrate that BHS outperforms existing methods in terms of bias reduction and interval coverage, even under substantial violations of modeling assumptions.
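The selection effect described above can be illustrated with a small simulation. This is a minimal sketch, not the BHS method itself: it assumes a zero-mean normal prior on true effects, a known and common sampling standard error, and a plain normal-normal empirical Bayes estimator with a single global shrinkage weight. The function name `simulate_winners_curse` and all parameter values are hypothetical choices for illustration.

```python
import random
import statistics


def simulate_winners_curse(n=10_000, prior_sd=1.0, noise_sd=2.0, z_crit=1.96, seed=0):
    """Compare face-value and globally shrunk estimates among 'launched' experiments.

    Assumptions of this sketch (not taken from the paper): normal prior,
    known homogeneous sampling noise, method-of-moments prior fit.
    """
    rng = random.Random(seed)

    # True effects and their noisy experiment-level estimates.
    theta = [rng.gauss(0.0, prior_sd) for _ in range(n)]
    theta_hat = [t + rng.gauss(0.0, noise_sd) for t in theta]

    # Empirical Bayes prior fit: marginal variance = prior variance + noise variance.
    mu_hat = statistics.fmean(theta_hat)
    tau2_hat = max(statistics.pvariance(theta_hat) - noise_sd ** 2, 0.0)

    # Global shrinkage weight on the data (same value for every experiment).
    shrink = tau2_hat / (tau2_hat + noise_sd ** 2)
    theta_eb = [mu_hat + shrink * (x - mu_hat) for x in theta_hat]

    # Selection step: keep only experiments that look significantly positive.
    winners = [i for i in range(n) if theta_hat[i] / noise_sd > z_crit]

    # Average error (estimate minus truth) among the selected experiments.
    bias_face = statistics.fmean([theta_hat[i] - theta[i] for i in winners])
    bias_eb = statistics.fmean([theta_eb[i] - theta[i] for i in winners])

    return {
        "n_winners": len(winners),
        "shrink": shrink,
        "bias_face_value": bias_face,
        "bias_global_eb": bias_eb,
    }


if __name__ == "__main__":
    for key, value in simulate_winners_curse().items():
        print(f"{key}: {value}")
```

In this low-power setting the face-value estimates of the selected experiments sit well above their true effects, while the shrunk estimates are much closer on average. The sketch works because the assumed prior is correct; the abstract's point is that global shrinkage can degrade when the prior is misspecified, which is the case the experiment-specific shrinkage factor targets.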

Paper Structure

This paper contains 20 sections, 4 theorems, 19 equations, 6 figures, and 1 table.

Key Result

Theorem 1

Let $\Theta = \sum_{i=1}^{N} \theta_i$ denote the true aggregated treatment effect, and let $\pi(\Theta \mid \boldsymbol{\hat{\theta}}_m)$ denote the posterior distribution of this aggregate effect given the vector of observed estimates $\boldsymbol{\hat{\theta}}_m = (\hat{\theta}_{1m}, \dots, \hat{

Figures (6)

  • Figure 1: $\delta_i = \hat{\theta}_i - \theta_i$ across $N = 10{,}000$ simulated experiments for the Face Value Frequentist estimator and the Bayesian estimator with Global Shrinkage.
  • Figure 2: MSE (top), bias ($\Delta$, middle) and coverage probability (bottom) as a function of sampling variance for face value, Bayesian with global shrinkage and BHS approaches.
  • Figure 3: MSE (left), bias ($\Delta$, center) and coverage probability (right) for face value, Bayesian with global shrinkage, and BHS approaches, as a function of prior mean ($\mu$, top), degrees of freedom ($\nu$, middle) and correlation ($\rho$, bottom).
  • Figure 4: Comparison of estimators across a collection of real-world experiments with paired replication studies for two different business units.
  • Figure 5: Distribution of errors for different estimators across a collection of real-world experiments with paired replication studies.
  • ...and 1 more figure

Theorems & Definitions (8)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof
  • Theorem 4
  • proof