Fixed-Horizon Self-Normalized Inference for Adaptive Experiments via Martingale AIPW/DML with Logged Propensities

Gabriel Saco

Abstract

Adaptive randomized experiments update treatment probabilities as data accrue, but still require an end-of-study interval for the average treatment effect (ATE) at a prespecified horizon. Under adaptive assignment, propensities can keep changing, so the predictable quadratic variation of AIPW/DML score increments may remain random. When no deterministic variance limit exists, Wald statistics normalized by a single long-run variance target can be conditionally miscalibrated given the realized variance regime. We assume no interference, sequential randomization, i.i.d. arrivals, and executed overlap on a prespecified scored set, and we require two auditable pipeline conditions: the platform logs the executed randomization probability for each unit, and the nuisance regressions used to score unit $t$ are constructed predictably from past data only. These conditions make the centered AIPW/DML scores an exact martingale difference sequence. Using self-normalized martingale limit theory, we show that the Studentized statistic, with variance estimated by realized quadratic variation, is asymptotically N(0,1) at the prespecified horizon, even without variance stabilization. Simulations validate the theory and highlight when standard fixed-variance Wald reporting fails.
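The Studentization described in the abstract can be illustrated with a minimal sketch. It assumes the standard two-arm AIPW score evaluated at the logged executed propensity, and an interval of the form estimate plus or minus a normal quantile times the square root of the realized quadratic variation divided by $n$. The paper's own interval (its equation `eq:ci`) and its scored-set restriction are not reproduced in this section, so the function names `aipw_scores` and `self_normalized_ci` and the exact centering used here are illustrative assumptions, not the author's implementation.

```python
import math
from statistics import NormalDist


def aipw_scores(y, a, e, m0, m1):
    """Per-unit AIPW scores (assumed standard form).

    y  : observed outcomes
    a  : treatment indicators (0 or 1)
    e  : logged executed propensities P(A_t = 1 | past), one per unit
    m0 : nuisance predictions for arm 0, built from past data only
    m1 : nuisance predictions for arm 1, built from past data only
    """
    scores = []
    for yt, at, et, m0t, m1t in zip(y, a, e, m0, m1):
        if not 0.0 < et < 1.0:
            raise ValueError("logged propensity must lie strictly in (0, 1)")
        # Inverse-propensity correction uses the logged propensity, not an estimate.
        correction = at * (yt - m1t) / et - (1 - at) * (yt - m0t) / (1.0 - et)
        scores.append(m1t - m0t + correction)
    return scores


def self_normalized_ci(scores, level=0.95):
    """Point estimate and interval Studentized by realized quadratic variation."""
    n = len(scores)
    tau_hat = sum(scores) / n
    # Realized quadratic variation of the centered score increments.
    qv = sum((s - tau_hat) ** 2 for s in scores)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(qv) / n
    return tau_hat, (tau_hat - half_width, tau_hat + half_width)


# Toy inputs (made up for illustration): constant propensity, zero nuisance fits.
demo_scores = aipw_scores(
    y=[1, 0, 2, 1], a=[1, 0, 1, 0], e=[0.5] * 4, m0=[0.0] * 4, m1=[0.0] * 4
)
print(self_normalized_ci(demo_scores))
```

The interval width depends only on the realized scores, so no deterministic variance limit has to be specified or estimated separately.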

Paper Structure

This paper contains 37 sections, 13 theorems, 65 equations, 1 figure, 11 tables.

Key Result

Lemma 4.6

Under forward cross-fitting (Definition 4.3), if for each block $I_k$ the analyst fits $(\widehat{m}_0^{(-k)},\widehat{m}_1^{(-k)})$ using only data from blocks $I_1,\dots,I_{k-1}$ and then reuses these fitted objects unchanged for all $t\in I_k$, then Assumption \ref{ass:predictable_nuis} holds.
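The block structure in the lemma can be sketched as follows. This is a minimal illustration, assuming contiguous time-ordered blocks given as `(start, end)` index pairs, arm-wise sample means as a stand-in for the nuisance regressions, and a prespecified constant `default` for arms with no past data (in particular for the first block). The function name `forward_crossfit_predictions` and these choices are hypothetical and not taken from the paper.

```python
def forward_crossfit_predictions(y, a, block_bounds, default=0.0):
    """Return (m0, m1) predictions where unit t in block k is scored with
    fits that use only data from blocks strictly before k.

    y            : observed outcomes, in arrival order
    a            : treatment indicators (0 or 1), in arrival order
    block_bounds : list of (start, end) pairs, end exclusive, in time order
    default      : prespecified value used when an arm has no past data
    """
    n = len(y)
    m0 = [default] * n
    m1 = [default] * n
    for start, end in block_bounds:
        # Fit on indices < start only, i.e. blocks I_1, ..., I_{k-1}.
        past_y0 = [y[t] for t in range(start) if a[t] == 0]
        past_y1 = [y[t] for t in range(start) if a[t] == 1]
        fit0 = sum(past_y0) / len(past_y0) if past_y0 else default
        fit1 = sum(past_y1) / len(past_y1) if past_y1 else default
        # Reuse the frozen fits for every unit in the current block.
        for t in range(start, end):
            m0[t] = fit0
            m1[t] = fit1
    return m0, m1


# Toy inputs (made up for illustration): three blocks of two units each.
blocks = [(0, 2), (2, 4), (4, 6)]
print(forward_crossfit_predictions([1, 0, 3, 2, 5, 4], [1, 0, 1, 0, 1, 0], blocks))
```

Because the fits for block $I_k$ never read outcomes from $I_k$ or later, changing any outcome in the current block leaves that block's predictions unchanged, which is the predictability property the lemma asserts.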

Figures (1)

  • Figure 1: Design A: distribution across replications of the regime-dependent long-run variance proxy. In this design, $V_{\mathcal{T}}^2/n_{\mathrm{eff}}$ converges to $16.25$ or $46.25$ depending on the burn-in sign, so there is no deterministic variance limit (Assumption \ref{ass:stab} fails). This motivates studentization by realized quadratic variation in the SN CI \eqref{eq:ci}.

Theorems & Definitions (48)

  • Definition 3.1: Adaptive logged executed propensity
  • Remark 3.3: Auditing logged propensities
  • Remark 3.4: Timing and measurability
  • Remark 3.11: How to enforce Assumption \ref{ass:nuis_stab}
  • Remark 3.12: Weaker moment conditions are possible but are not needed for this paper
  • Remark 3.13: Design-stage, analysis-stage and auditable conditions
  • Remark 4.1: Terminology
  • Remark 4.2: Multi-arm extensions
  • Definition 4.3: Forward cross-fitting
  • Remark 4.4: Common pitfall
  • ...and 38 more