Data-Driven Switchback Experiments: Theoretical Tradeoffs and Empirical Bayes Designs

Ruoxuan Xiong, Alex Chin, Sean J. Taylor

Abstract

We study the design and analysis of switchback experiments conducted on a single aggregate unit. The design problem is to partition the continuous time space into intervals and switch treatments between intervals, in order to minimize the estimation error of the treatment effect. We show that the estimation error depends on four factors: carryover effects, periodicity, serially correlated outcomes, and impacts from simultaneous experiments. We derive a rigorous bias-variance decomposition and show the tradeoffs of the estimation error from these factors. The decomposition provides three new insights in choosing a design: First, balancing the periodicity between treated and control intervals reduces the variance; second, switching less frequently reduces the bias from carryover effects while increasing the variance from correlated outcomes, and vice versa; third, randomizing interval start and end points reduces both bias and variance from simultaneous experiments. Combining these insights, we propose a new empirical Bayes design approach. This approach uses prior data and experiments for designing future experiments. We illustrate this approach using real data from a ride-sharing platform, yielding a design that reduces MSE by 33% compared to the status quo design used on the platform.
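As a rough illustration of the design problem described above (partitioning time into intervals and switching treatment between them), the following minimal sketch generates two kinds of switchback schedules. It assumes each interval is treated independently with probability 1/2, as in the setting of Theorem 4.1, and uses the 56-minute interval length from the toy design in Figure 1. The function names are hypothetical, and drawing exponential interval lengths for the randomized-switch-point variant is an assumption about what a Poisson-duration switchback looks like, not the paper's definition.

```python
import random


def fixed_duration_design(horizon_min, interval_min, seed=0):
    """Partition [0, horizon_min) into equal-length intervals (the last one
    may be shorter) and assign each interval to treatment (1) or control (0)
    independently with probability 1/2."""
    rng = random.Random(seed)
    design = []
    start = 0.0
    while start < horizon_min:
        end = min(start + interval_min, horizon_min)
        design.append((start, end, rng.randint(0, 1)))
        start = end
    return design


def poisson_duration_design(horizon_min, mean_interval_min, seed=0):
    """Same idea, but with randomized switching points: interval lengths are
    drawn from an exponential distribution (assumed form, for illustration)."""
    rng = random.Random(seed)
    design = []
    start = 0.0
    while start < horizon_min:
        length = rng.expovariate(1.0 / mean_interval_min)
        end = min(start + length, horizon_min)
        design.append((start, end, rng.randint(0, 1)))
        start = end
    return design


def treatment_at(design, t):
    """Return the treatment status (0 or 1) in effect at time t."""
    for start, end, w in design:
        if start <= t < end:
            return w
    raise ValueError("t lies outside the experiment horizon")


# One day (1440 minutes) with 56-minute intervals.
one_day_fixed = fixed_duration_design(1440, 56, seed=1)
one_day_random = poisson_duration_design(1440, 56, seed=1)
```

Lengthening `interval_min` corresponds to switching less frequently, which the abstract associates with less carryover bias but more variance from correlated outcomes. The second function is one simple way to randomize interval start and end points.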

Paper Structure (54 sections, 19 theorems, 173 equations, 20 figures)

Key Result

Theorem 4.1

Suppose Assumptions \ref{ass:exogeneity} and \ref{ass:interference-structure} hold, $W^{(m)}$ is independent across $m$ with $\mathbf{P}(W^{(m)} = 1) = 1/2$, and $W^{\mathrm{s}(m)}$ is independent across $m$ with $\mathbf{P}(W^{\mathrm{s}(m)} = 1) = 1/2$. Moreover, $\bm{W}$ is independent of $\bm{W}^\mathrm{s}$. The e… where …

Figures (20)

  • Figure 1: Illustration of the four factors that affect the estimation error of GATE. This figure is generated using data from a ride-sharing platform. In the illustration of (toy) switchback designs for both primary and simultaneous experiments, dashed lines are switching points, and treated intervals are shaded. Each interval has a fixed length of $56$ minutes. There may be one or more experiments running simultaneously with the primary experiment. In the illustration of cumulative effect curves (CECs), the horizontal line is at the value $0$. The CEC of the primary experiment is unknown ex ante, but we can estimate a distribution of CECs from prior experiments. In the illustration of periodicity, the event density over the time in a week is shown, where an event can be a rider opening the app and checking the price. More illustrations of the periodic patterns are shown in Section \ref{sec:empirical}. In the illustration of serially correlated outcomes, the serial correlation decreases with the absolute time difference between two outcomes.
  • Figure 2: An illustration of the cumulative effect $\delta^\mathrm{cum}_{t}(\Delta t)$ at time $t$ after being treated for a duration of $\Delta t$ in the primary experiment, while holding the simultaneous intervention in the control state ($\bm{W}^\mathrm{s}_t = \bm{0}_t$). As $\Delta t$ grows to infinity, $\delta^\mathrm{cum}_{t}(\Delta t)$ converges to $\delta^\mathrm{gate}_t$.
  • Figure 3: Empirical Bayes approach for switchback designs. See Section \ref{subsec:design} for more details.
  • Figure 4: Illustration of various designs and daily periodic event density (dashed lines are switching points, and treated intervals are shaded) on Mondays in a two-week experiment. The design on other days of the week is analogous. If a design is not balanced, the design on each day is independent of the design on other days; otherwise, the design in the second week mirrors the design in the first week, and within the first week, the design on each day is independent of the design on other days.
  • Figure 5: Event density, standardized mean control outcome (denoted by $Z_{\mu_{Y,t}}$), and standardized heteroscedastic measurement error (denoted by $Z_{\sigma_{t}}$) from Monday 12 AM to Sunday 11:59 PM.
  • ...and 15 more figures
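
The balanced two-week design described in the Figure 4 caption can be sketched roughly as follows. This is a minimal sketch under a stated assumption: "mirrors" is read here as each interval in the second week receiving the opposite assignment of the same time-of-week interval in the first week, so that every time of week is treated once and in control once. The caption does not spell this out, and the function names are hypothetical. The 56-minute interval length is taken from Figure 1, and a week of $7 \times 24 \times 60 = 10080$ minutes then contains exactly $180$ such intervals.

```python
import random

MINUTES_PER_WEEK = 7 * 24 * 60  # 10080


def intervals_per_week(interval_min):
    """Number of whole fixed-length intervals that fit in one week."""
    return MINUTES_PER_WEEK // interval_min


def balanced_two_week_design(n_intervals, seed=0):
    """Week 1: independent fair coin flips per interval.
    Week 2: the flipped assignment at the same time of week
    (assumed reading of 'mirrors' in the Figure 4 caption)."""
    rng = random.Random(seed)
    week1 = [rng.randint(0, 1) for _ in range(n_intervals)]
    week2 = [1 - w for w in week1]
    return week1, week2


n = intervals_per_week(56)
week1, week2 = balanced_two_week_design(n, seed=3)
```

Under this reading, periodic variation in the event density and in the outcome enters the treated and control groups equally at every time of week, which is the intuition the abstract gives for why balancing periodicity reduces variance.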

Theorems & Definitions (53)

  • Example 2.1: Uniform event density
  • Example 2.2: Periodic event density
  • Remark 2.1: Marketplace and event outcomes
  • Remark 2.2: Number of aggregate units
  • Example 2.3: Fixed duration switchback
  • Example 2.4: Poisson duration switchback
  • Example 2.5: Change-of-measure switchback
  • Example 2.6: Balanced randomized design
  • Example 4.1: Uniform carryover kernel
  • Example 4.2: Linear decay carryover kernel
  • ...and 43 more