A Simple Model to Estimate Sharing Effects in Social Networks

Olivier Jeunen

TL;DR

The paper addresses interference in A/B tests on social networks, where sharing makes users' outcomes depend on one another. It introduces an MDP-based model of user sharing behaviour and derives an unbiased estimator, Differences-in-Geometrics, by expressing the value of a policy as a geometric sum, e.g., $V(\pi_a) = 1/(1 - \gamma_a)$. The estimator is state-agnostic and relies on end-probabilities estimated via $\gamma_a$. It is unbiased under a mild independence assumption and outperforms existing methods in synthetic experiments. The approach offers a practical way to quantify sharing effects and guide feature rollout decisions.
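
The geometric-sum form can be spelled out under one reading of the notation. Assume (an interpretation, not something this summary states) that $\gamma_a$ is the probability that a session served variant $a$ leads to one further session, independently of the state. The expected trajectory length under the constant policy $\pi_a$ is then a geometric series, and the effect of shipping a treatment over a control is a difference of two such terms. The labels $a=1$ (treatment) and $a=0$ (control) and the symbol $\Delta$ are introduced here only for illustration:

$$
V(\pi_a) = \sum_{t=0}^{\infty} \gamma_a^{t} = \frac{1}{1-\gamma_a}, \qquad 0 \le \gamma_a < 1,
$$

$$
\Delta = V(\pi_1) - V(\pi_0) = \frac{1}{1-\gamma_1} - \frac{1}{1-\gamma_0}.
$$

For example, with made-up values $\gamma_0 = 0.3$ and $\gamma_1 = 0.4$, this gives $\Delta = 1/0.6 - 1/0.7 \approx 1.667 - 1.429 = 0.238$ additional sessions per trajectory.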

Abstract

Randomised Controlled Trials (RCTs) are the gold standard for estimating treatment effects across many fields of science. Technology companies have adopted A/B-testing methods as a modern RCT counterpart, where end-users are randomly assigned various system variants and user behaviour is tracked continuously. The objective is then to estimate the causal effect that the treatment variant would have on certain metrics of interest to the business. When the outcomes for randomisation units -- end-users in this case -- are not statistically independent, this obfuscates identifiability of treatment effects, and harms decision-makers' observability of the system. Social networks exemplify this, as they are designed to promote inter-user interactions. This interference by design notoriously complicates measurement of, e.g., the effects of sharing. In this work, we propose a simple Markov Decision Process (MDP)-based model describing user sharing behaviour in social networks. We derive an unbiased estimator for treatment effects under this model, and demonstrate through reproducible synthetic experiments that it outperforms existing methods by a significant margin.

Paper Structure (5 sections, 9 equations, 2 figures)


Figures (2)

  • Figure 1: An example trajectory from our MDP: session $s_{1}$ which was assigned system variant $a_{1}$ leads to session $s_{2}$ (variant $a_{2}$), which leads to $s_{3}$ (variant $a_{3}$), and finally $s_{4}$ (variant $a_{2}$). We wish to estimate the expectation of trajectory lengths under constant actions (i.e. shipping a variant to all users).
  • Figure 2: Treatment effect estimation errors for a synthetic setup simulating sharing effects, showing 95% confidence intervals over 32 repeated runs. We observe that the Differences-in-Geometrics estimator performs favourably compared to alternatives.
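
The setting in Figure 1 can be illustrated with a small simulation. The following is a minimal sketch, not the paper's implementation: it assumes each session is randomly assigned a variant and leads to a further session with a variant-specific probability `gamma[a]`, independent of everything else. The function names, the variant labels, and the values 0.30 and 0.40 are hypothetical. The estimate is a simple plug-in of the per-variant continuation rate into $1/(1-\gamma_a)$.

```python
import random


def simulate_sessions(gamma, n_trajectories, rng):
    """Simulate sharing chains with a randomly assigned variant per session.

    gamma maps each variant to the (assumed) probability that a session
    under that variant leads to one further session.
    Returns a list of (variant, continued) pairs, one per session.
    """
    variants = list(gamma)
    sessions = []
    for _ in range(n_trajectories):
        while True:
            a = rng.choice(variants)             # randomised assignment
            continued = rng.random() < gamma[a]  # does this session get shared on?
            sessions.append((a, continued))
            if not continued:
                break
    return sessions


def differences_in_geometrics(sessions, treatment, control):
    """Plug-in estimate of 1/(1 - gamma_treatment) - 1/(1 - gamma_control)."""
    def value(a):
        outcomes = [cont for variant, cont in sessions if variant == a]
        gamma_hat = sum(outcomes) / len(outcomes)  # empirical continuation rate
        # Undefined if every observed session continued (gamma_hat == 1).
        return 1.0 / (1.0 - gamma_hat)
    return value(treatment) - value(control)


rng = random.Random(0)
gamma = {"control": 0.30, "treatment": 0.40}  # hypothetical ground truth
sessions = simulate_sessions(gamma, 50_000, rng)

estimate = differences_in_geometrics(sessions, "treatment", "control")
truth = 1 / (1 - gamma["treatment"]) - 1 / (1 - gamma["control"])
print(f"estimate={estimate:.4f}  truth={truth:.4f}")
```

Because the variant changes from session to session within a trajectory, the observed trajectory lengths reflect a mixture of both variants. Estimating the continuation rate per variant and extrapolating through the geometric sum is what lets the sketch target the constant-action quantity described in the caption.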