Tackling Interference Induced by Data Training Loops in A/B Tests: A Weighted Training Approach

Nian Si

TL;DR

This work addresses interference in A/B tests caused by data training loops in recommender systems: when control and treatment data are pooled for training, the training distribution can shift and treatment effect estimates can become biased. It introduces a weighted training framework with weights $W_T(X_E,Y_E,Z)=\frac{\mathbb{E}[Z|X_E]}{p}$ and $W_C(X_E,Y_E,Z)=\frac{1-\mathbb{E}[Z|X_E]}{1-p}$. A weighting model ${G}_{\theta_W}$ estimates $\mathbb{E}[Z|X_E]$, and the resulting weights are applied to the training loss so that training incurs no distributional shift; among estimators satisfying this no-shift constraint, the approach attains the minimum variance. Lemma 1 (distribution recovery) and Theorem 1 (variance minimization) provide the theoretical guarantees, and simulations show reduced bias and competitive variance compared with data pooling, snapshot, and data-splitting approaches in both A/B and A/A tests. In practice, this improves data efficiency and the reliability of inference in systems with feedback loops, supporting more robust evaluation of feature and algorithm updates in live recommendations. The work also discusses data-diverted alternatives and outlines future directions: training with a single model and improving variance inference in interference-prone settings.

Abstract

In modern recommendation systems, the standard pipeline involves training machine learning models on historical data to predict user behaviors and improve recommendations continuously. However, these data training loops can introduce interference in A/B tests, where data generated by control and treatment algorithms, potentially with different distributions, are combined. To address these challenges, we introduce a novel approach called weighted training. This approach entails training a model to predict the probability of each data point appearing in either the treatment or control data and subsequently applying weighted losses during model training. We demonstrate that this approach achieves the least variance among all estimators that do not cause shifts in the training distributions. Through simulation studies, we demonstrate the lower bias and variance of our approach compared to other methods.
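The two steps described above (fit a model for the probability that a pooled data point came from treatment, then reweight the training loss) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the one-dimensional logistic regression standing in for the weighting model, the simulated Gaussian features, and all function names (`fit_weighting_model`, `training_weights`, `weighted_mean`) are hypothetical. Only the weight formulas $W_T=\mathbb{E}[Z|X_E]/p$ and $W_C=(1-\mathbb{E}[Z|X_E])/(1-p)$ come from the text.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def fit_weighting_model(xs, zs, lr=0.5, epochs=300):
    """Fit g(x) ~ E[Z | X_E = x] by gradient descent on the logistic loss.

    A 1-D logistic regression is an assumed stand-in for the weighting model.
    """
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for x, z in zip(xs, zs):
            err = sigmoid(a * x + b) - z
            grad_a += err * x
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return lambda x: sigmoid(a * x + b)


def training_weights(g_x, p):
    """Return (W_T, W_C) for one point, given g_x ~ E[Z | X_E] and treatment share p."""
    return g_x / p, (1.0 - g_x) / (1.0 - p)


def weighted_mean(values, weights):
    """Average of weight * value; in training, `values` would be per-example losses."""
    return sum(w * v for w, v in zip(weights, values)) / len(values)


# Simulated pooled data (hypothetical): treatment features ~ N(1, 1), control ~ N(0, 1).
random.seed(0)
p = 0.5
zs = [1 if random.random() < p else 0 for _ in range(2000)]
xs = [random.gauss(1.0 if z else 0.0, 1.0) for z in zs]

g = fit_weighting_model(xs, zs)
pairs = [training_weights(g(x), p) for x in xs]
w_t = [w[0] for w in pairs]
w_c = [w[1] for w in pairs]

# Sanity check: reweighting the pooled features should approximately recover
# each arm's own feature mean (1 for treatment, 0 for control in this simulation).
mean_t = weighted_mean(xs, w_t)
mean_c = weighted_mean(xs, w_c)
```

In an actual training loop, `w_t` would multiply the per-example losses of the model serving the treatment arm and `w_c` those of the model serving the control arm, so that both arms can use all pooled data.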

Paper Structure

This paper contains 15 sections, 2 theorems, 32 equations, 9 figures, 10 tables, 1 algorithm.

Key Result

Lemma 1

The weighted functions satisfy where $\overset{d}{\mathcal{=}}$ means equal in distribution.
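
One way to see why weights of the form $W_T=\mathbb{E}[Z|X_E]/p$ can recover the treatment distribution from pooled data is the following sketch. It rests on assumptions that are labeled here rather than taken from the paper: $Z\sim\mathrm{Bernoulli}(p)$ marks treatment data, $X_E$ has density $f_T$ given $Z=1$ and $f_C$ given $Z=0$, and $h$ is any integrable test function.

$$
\begin{aligned}
f_E(x) &= p\,f_T(x) + (1-p)\,f_C(x), \\
\mathbb{E}[Z \mid X_E = x] &= \frac{p\,f_T(x)}{f_E(x)} \quad \text{(Bayes' rule)}, \\
\mathbb{E}\big[W_T\,h(X_E)\big] &= \int \frac{1}{p}\cdot\frac{p\,f_T(x)}{f_E(x)}\,h(x)\,f_E(x)\,dx = \int h(x)\,f_T(x)\,dx = \mathbb{E}\big[h(X_T)\big].
\end{aligned}
$$

The same calculation with $1-\mathbb{E}[Z|X_E]$ and $1-p$ gives $\mathbb{E}[W_C\,h(X_E)]=\mathbb{E}[h(X_C)]$. Extending the argument from $X$ alone to the pair $(X,Y)$ additionally requires that the conditional distribution of $Y$ given $X$ is the same in both arms, which is treated here as an assumption.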

Figures (9)

  • Figure 1: A standard pipeline in a recommendation system
  • Figure 2: An A/B testing procedure
  • Figure 3: Dependence of different objects in the data training loops, where we omit the subscript $i$ for simplicity
  • Figure 4: A/B testing results for $\alpha_C=10$, $\alpha_T=9$, and $p = 1/2$
  • Figure 5: A/B testing results for $\alpha_C=10$, $\alpha_T=9$, and $p =0.2$
  • ...and 4 more figures

Theorems & Definitions (5)

  • Example 1: Experimenting parameters of fusion formulas
  • Lemma 1
  • Theorem 1
  • Proof: Proof of Lemma 1
  • Proof: Proof of Theorem 1