Semiparametric Efficient Inference in Adaptive Experiments

Thomas Cook, Alan Mishler, Aaditya Ramdas

Abstract

We consider the problem of efficient inference on the Average Treatment Effect in a sequential experiment where the policy governing the assignment of subjects to treatment or control can change over time. We first provide a central limit theorem for the Adaptive Augmented Inverse-Probability Weighted estimator, which is semiparametric efficient, under weaker assumptions than those previously made in the literature. This central limit theorem enables efficient inference at fixed sample sizes. We then consider a sequential inference setting, deriving both asymptotic and nonasymptotic confidence sequences that are considerably tighter than those produced by previous methods. These anytime-valid methods enable inference under data-dependent stopping times (sample sizes). Additionally, we use propensity score truncation techniques from the recent off-policy estimation literature to reduce the finite-sample variance of our estimator without affecting the asymptotic variance. Empirical results demonstrate that our methods yield narrower confidence sequences than those previously developed in the literature while maintaining time-uniform error control.

Paper Structure (45 sections, 9 theorems, 96 equations, 6 figures)

Key Result

Theorem 1

Assume $\{(X_t, A_t, Y_t)\}_{t=1}^{T}$ follows the data-generating process described in Section sec:dgp. Let $\tilde{\pi}_t: \mathcal{A} \times \mathcal{X} \mapsto (0, 1)$ be an arbitrary sequence of policies, and let $\pi_t$ be the corresponding truncated policies as defined in eq:pi_arbitrary. Then, as $T \to \infty$, $\sqrt{T}\,(\hat{\theta}_T^{\mathrm{A2IPW}} - ATE)$ converges in distribution to $\mathcal{N}(0, \sigma^2)$, where $\sigma^2$ is the semiparametric lower bound on the asymptotic variance of regular estimators.
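
To make the objects in Theorem 1 concrete, the following is a minimal sketch of an A2IPW-style estimate with a truncated adaptive policy and a CLT-based interval at a fixed sample size. Everything specific in it is an assumption made for illustration, not the paper's setup: there are no covariates, outcomes are Bernoulli with hypothetical means `mu1` and `mu0`, the regression estimates are running arm means, the untruncated policy is a simple Neyman-style rule, the truncation is a fixed clip to $[0.3, 0.7]$ (the range mentioned in the Figure 1 caption) rather than the schedule in eq:pi_arbitrary, and the variance is a plain sample variance of the per-round scores.

```python
import math
import random


def clip(p, lo=0.3, hi=0.7):
    """Truncate a treatment probability to [lo, hi] (assumed fixed range)."""
    return min(max(p, lo), hi)


def a2ipw_run(T, mu1=0.7, mu0=0.3, seed=0):
    """Simulate one adaptive experiment and return (estimate, 95% CLT interval).

    mu1 and mu0 are hypothetical Bernoulli outcome means, so the true ATE
    is mu1 - mu0.
    """
    rng = random.Random(seed)
    sums = [0.0, 0.0]   # running outcome sums per arm
    counts = [0, 0]     # running sample counts per arm
    scores = []
    for _ in range(T):
        # Regression estimates built only from past rounds (0.5 before any data).
        f = [sums[a] / counts[a] if counts[a] else 0.5 for a in (0, 1)]
        # Assumed adaptive policy: favor the arm with larger estimated
        # outcome standard deviation, then truncate.
        sd = [math.sqrt(max(f[a] * (1.0 - f[a]), 1e-6)) for a in (0, 1)]
        p1 = clip(sd[1] / (sd[0] + sd[1]))
        a = 1 if rng.random() < p1 else 0
        y = 1.0 if rng.random() < (mu1 if a == 1 else mu0) else 0.0
        # Per-round score: inverse-probability-weighted residual of the
        # observed arm plus the difference of the regression estimates.
        pa = p1 if a == 1 else 1.0 - p1
        sign = 1.0 if a == 1 else -1.0
        scores.append(sign * (y - f[a]) / pa + f[1] - f[0])
        # Update the regression state only after the score is computed.
        sums[a] += y
        counts[a] += 1
    theta = sum(scores) / T
    # Plug-in variance (an assumption): sample variance of the scores.
    var = sum((s - theta) ** 2 for s in scores) / (T - 1)
    half = 1.96 * math.sqrt(var / T)
    return theta, (theta - half, theta + half)


if __name__ == "__main__":
    est, (lower, upper) = a2ipw_run(20000, seed=1)
    print(f"estimate={est:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Because the policy and the regression estimates at each round depend only on earlier rounds, each score is centered on the ATE conditional on the past, which is the structure a martingale central limit theorem relies on. The interval here is valid only at a prespecified sample size; the confidence sequences listed below are the tools for data-dependent stopping.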

Figures (6)

  • Figure 1: A single run of an experiment with bounded outcomes and the $ATE$ set to $0.4$ (simulation setup of Appendix \ref{appdx:implementation_bounded} with $\pi_t \in [0.3,0.7]$). We propose confidence sequences (AsympCS, Pr-PI, Hedged) that are narrower than those of previous work (kato2021).
  • Figure 2: Cumulative error probability (a, c) and power (b, d) as functions of sample size, for the experiments from Appendix \ref{appdx:implementation_bernoulli} and Appendix \ref{appdx:implementation_bounded}. The top row corresponds to the experiment with Bernoulli outcomes, and the bottom row corresponds to the experiment with bounded, continuous outcomes. Intervals based on the CLT (Theorem \ref{theorem:MDS-CLT}), AsympCS (Theorem \ref{theorem:asympCS}), Pr-PI (Theorem \ref{theorem:prpl_empbern}), Hedged (Theorem \ref{theorem:bettingCS}), and kato2021 begin at $t = 50$.
  • Figure 3: When $\pi_t$ is bounded in a narrower range, intervals produced by a Pr-PI CS are narrower at smaller $t$.
  • Figure 4: Results using a kNN regressor for the protocol used in Figure \ref{fig:miscoverage-bernoulli}. The policy used for Pr-PI is modified to be truncated within $[0.2,0.8]$.
  • Figure 5: Results for the simulation described in Appendix \ref{appdx:implementation_bounded} using a k-Nearest Neighbor regressor.
  • ...and 1 more figure

Theorems & Definitions (13)

  • Remark 1
  • Theorem 1: Asymptotic Distribution of $\hat{\theta}_T^{\mathrm{A2IPW}}$
  • Remark 2: Semiparametric Efficiency
  • Theorem 2: Hedged-CS [Hedged]
  • Theorem 3: Predictable Plug-In Empirical Bernstein CS [Pr-PI]
  • Theorem 4: Asymptotic CS [AsympCS]
  • Theorem 5: MDS Central Limit Theorem
  • Lemma 1: Convergence of \ref{eq:term1mds}
  • Lemma 2: Convergence of \ref{eq:term2mds}
  • Lemma 3: Convergence of \ref{eq:term3mds}
  • ...and 3 more