Using Prior Studies to Design Experiments: An Empirical Bayes Approach

Zhiheng You

Abstract

We develop an empirical Bayes framework for experimental design that leverages information from prior related studies. A researcher with access to estimates from previous studies of similar parameters can use empirical Bayes to estimate an informative prior over the parameter of interest in the new study. We show how this prior can be incorporated into a decision-theoretic experimental design framework to choose an optimal design. The approach is developed for propensity score designs in stratified randomized experiments. Our theoretical results show that the empirical Bayes design achieves oracle-optimal performance as the number of prior studies grows, and they characterize the rate at which regret vanishes. We present two empirical applications: oncology drug trials and the Tennessee Project STAR experiment. Our framework connects the Bayesian meta-analysis literature to experimental design and provides practical guidance for researchers seeking to design more efficient experiments.

Paper Structure (43 sections, 26 theorems, 177 equations, 15 figures, 6 tables)

Introduction
General Setup
Estimating the Prior from Previous Studies
Gaussian Prior
Nonparametric Prior via NPMLE
Propensity Score Designs in Stratified Randomized Experiments
Design Class and Design-Induced Likelihood
Prior-Study Information
EB Designs and No-Information Benchmarks
Optimal EB Designs Under Different Objectives
Theoretical Analysis
Regret Bounds and Gains over No Prior Evidence
Regret Consistency
Rates of Convergence
Characterization of Gaussian Experiments
...and 28 more sections

Key Result

Proposition 1

Suppose $Q=\mathcal{N}(m,V)$ and the Gaussian sampling assumption holds. Then the posterior covariance of $\theta$ given $(\hat{\theta},e)$ is $\left(V^{-1}+\Sigma(e)^{-1}\right)^{-1}$, which depends on the design but not on the realized data. Moreover, the ex-ante Bayes risk for estimating $L\theta$ equals $\mathrm{tr}\!\left(\Lambda\,L\left(V^{-1}+\Sigma(e)^{-1}\right)^{-1}L^{\top}\right)$. Under a diffuse no-information prior (formally $V^{-1}\to 0$), the criterion reduces to minimizing $\mathrm{tr}(\Lambda\,L \Sigma(e) L^{\top})$.
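The proposition's claim that the posterior covariance depends on the design but not on the realized data can be checked numerically. The following is a minimal sketch, not the paper's implementation. It assumes the standard conjugate Gaussian model, $\theta\sim\mathcal{N}(m,V)$ and $\hat{\theta}\mid\theta,e\sim\mathcal{N}(\theta,\Sigma(e))$, under which the posterior covariance is $(V^{-1}+\Sigma(e)^{-1})^{-1}$. The function names, the two-stratum setup, and the difference-in-means form of `sampling_cov` are illustrative assumptions and do not come from the source.

```python
import numpy as np

def sampling_cov(e, n=(100.0, 100.0), sigma2=1.0):
    """Hypothetical design-induced covariance Sigma(e).

    Assumes independent strata with n[s] units, outcome variance sigma2 in
    both arms, and a difference-in-means estimator, so that
    Var_s = sigma2 / (n_s e_s) + sigma2 / (n_s (1 - e_s)).
    """
    e = np.asarray(e, dtype=float)
    n = np.asarray(n, dtype=float)
    return np.diag(sigma2 / (n * e * (1.0 - e)))

def posterior_cov(V, Sigma):
    """Posterior covariance under the conjugate Gaussian model.

    No data argument is needed: the covariance is a function of the
    prior covariance V and the design-induced covariance Sigma only.
    """
    return np.linalg.inv(np.linalg.inv(V) + np.linalg.inv(Sigma))

def bayes_risk(V, Sigma, L, Lam):
    """Ex-ante Bayes risk tr(Lam L PostCov L^T) for estimating L theta."""
    return float(np.trace(Lam @ L @ posterior_cov(V, Sigma) @ L.T))

def no_info_risk(Sigma, L, Lam):
    """Diffuse-prior criterion tr(Lam L Sigma L^T)."""
    return float(np.trace(Lam @ L @ Sigma @ L.T))

# Illustrative inputs (assumed, not taken from the paper).
V = np.diag([0.05, 0.5])   # prior is tight on stratum 1, loose on stratum 2
L = np.eye(2)              # estimate theta itself
Lam = np.eye(2)            # equal loss weights
Sigma_balanced = sampling_cov([0.5, 0.5])

risk_eb = bayes_risk(V, Sigma_balanced, L, Lam)
risk_flat = no_info_risk(Sigma_balanced, L, Lam)

# A very diffuse prior should approximately recover the no-information criterion.
risk_diffuse = bayes_risk(1e8 * np.eye(2), Sigma_balanced, L, Lam)
```

In this sketch an informative prior lowers the risk relative to the diffuse benchmark for the same design, and sending the prior variance to a large value recovers the no-information criterion. Minimizing `bayes_risk` over `e` with any numerical optimizer would give a design of the kind the proposition characterizes, subject to the assumptions above.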

Figures (15)

Figure 1: OS estimated prior marginals: Gaussian and NPMLE
Figure 2: School-level treatment-effect estimates by stratum
Figure 3: Estimated prior distributions and moments (joint priors)
Figure 4: Optimal treatment propensities by objective under EB priors
Figure A-1: OS prior-study estimates by PD-L1 subgroup
...and 10 more figures

Theorems & Definitions (37)

Proposition 1: Quadratic-loss design criterion under a Gaussian prior
Proposition 2: Optimal EB propensity design for in-experiment welfare
Lemma 1: Distribution of the posterior mean
Proposition 3: Optimal design under Gaussian model
Remark 1: Experiment with noncompliance
Theorem 1: Finite-sample oracle inequality
Corollary 1: EB versus no-information benchmark
Remark 2: A less conservative sufficient condition
Example 1: Two-stratum quadratic loss
Theorem 2: Regret consistency
...and 27 more
