Best of Three Worlds: Adaptive Experimentation for Digital Marketing in Practice

Tanner Fiez; Houssam Nassif; Yu-Cheng Chen; Sergio Gamez; Lalit Jain

TL;DR

The authors develop an adaptive experimental design (AED) framework for counterfactual inference, informed by their experience of naively using AED systems in industrial settings where non-stationarity is prevalent, and test it in a commercial environment.

Abstract

Adaptive experimental design (AED) methods are increasingly being used in industry as a tool to boost testing throughput or reduce experimentation cost relative to traditional A/B/N testing methods. However, the behavior and guarantees of such methods are not well-understood beyond idealized stationary settings. This paper shares lessons learned regarding the challenges of naively using AED systems in industrial settings where non-stationarity is prevalent, while also providing perspectives on the proper objectives and system specifications in such settings. We developed an AED framework for counterfactual inference based on these experiences, and tested it in a commercial environment.

Paper Structure (37 sections, 4 theorems, 33 equations, 13 figures, 1 algorithm)

Introduction
Contributions.
Real-World Studies and Lessons
Lessons Learned
Our Approach
Estimation with Time Variation
Running Empirical Means.
Cumulative Gain
Cumulative Gain Estimator.
Always-Valid Inference.
Adaptive Counterfactual Inference
Algorithm Description
Experiments and Guarantees
Offline Experiments
Comparison Algorithms.
...and 22 more sections

Key Result

Proposition 1

For any arm $i\in [k]$ and day horizon $T$, the estimator $\widehat{G}_{i,T}=\sum\nolimits_{t=1}^T (r_{i,t}/p_{i,t})$ is unbiased for the cumulative gain. That is, we have $\mathbb{E}[\widehat{G}_{i,T}]=G_{i,T}$, where $G_{i,T}$ is the cumulative gain as defined in the paper.
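
To make the proposition concrete, the following is a minimal simulation sketch, not the paper's implementation. It rests on assumptions that this summary does not state: each day $t$ has a fixed number of visitors, each visitor is assigned to arm $i$ independently with probability $p_{i,t}$, rewards are Bernoulli with a day-specific mean, $r_{i,t}$ is the total reward collected by arm $i$ on day $t$, and the cumulative gain is the expected total reward had all traffic gone to arm $i$. The names `simulate_ipw_gain` and `true_gain`, and all numeric values, are hypothetical. For simplicity the allocation probabilities are fixed in advance rather than chosen adaptively.

```python
import random


def simulate_ipw_gain(means, probs, visitors_per_day, rng):
    """Run one simulated experiment for a single arm and return
    the inverse-probability-weighted estimate sum_t r_t / p_t."""
    estimate = 0.0
    for mu_t, p_t in zip(means, probs):
        r_t = 0.0  # total reward the arm collects on this day
        for _ in range(visitors_per_day):
            if rng.random() < p_t:  # visitor is assigned to the arm
                # Bernoulli reward with the arm's mean for this day
                r_t += 1.0 if rng.random() < mu_t else 0.0
        estimate += r_t / p_t  # reweight by the allocation probability
    return estimate


def true_gain(means, visitors_per_day):
    """Assumed cumulative gain: expected total reward if every
    visitor on every day had been sent to this arm."""
    return sum(visitors_per_day * mu_t for mu_t in means)


rng = random.Random(0)
means = [0.10, 0.30, 0.05, 0.20]  # hypothetical time-varying daily means
probs = [0.50, 0.10, 0.80, 0.25]  # hypothetical daily allocation probabilities
visitors_per_day = 50
runs = 4000

# Average the estimator over many independent simulated experiments.
avg = sum(
    simulate_ipw_gain(means, probs, visitors_per_day, rng) for _ in range(runs)
) / runs
target = true_gain(means, visitors_per_day)

print(f"average estimate: {avg:.2f}, assumed cumulative gain: {target:.2f}")
```

Under these assumptions the average of the estimates should land close to the assumed cumulative gain even though the daily means vary and the arm receives very different traffic shares on different days. Days with a small $p_{i,t}$ contribute the most variance, since their rewards are scaled up by a large factor.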

Figures (13)

Figure 1: Case study of time-variation and adaptive allocations causing Simpson's paradox.
Figure 2: Daily empirical means from marketing experiments with uniformly collected data.
Figure 3: Offline experiment 1. The daily arm means (a), regret as a function of the day (b), regret at the stopping time (c), and the probability of identifying the optimal arm with statistical significance by a given day (d).
Figure 4: Live experiment 1: TS catastrophically fails on production data and shifts all traffic to the worst arm.
Figure 5: Live experiment 2: TS fails to obtain significantly higher reward relative to CGSE and gives misleading inferences.
...and 8 more figures

Theorems & Definitions (4)

Proposition 1
Proposition 2
Proposition 3
Proposition 4
