Unbiased Estimation for Total Treatment Effect Under Interference Using Aggregated Dyadic Data

Lu Deng; Yilin Li; JingJing Zhang; Yong Wang; Chuan Chen

TL;DR

This work tackles causal inference under network interference by modeling dyadic interactions and proposing an unbiased estimator for the total treatment effect (TTE) using aggregated dyadic data. By combining two Horvitz–Thompson–based estimators—one on dyadic outcomes $Y_i$ and one on diffusion-like dyadic outcomes $D_i$—into $\hat{\tau} = \hat{\tau}^{1} + \hat{\tau}^{2}$, the authors show that the estimator is unbiased for the full population when treatment is assigned with $\pi=0.5$, and that it exhibits reduced bias in subpopulation and two-stage cluster designs. Through extensive simulations on large social-network graphs and a real-world Weixin experiment, they demonstrate the practical advantages of their approach for estimating the total causal effect in settings with interference. The results offer a scalable, data-structure–driven method for improving causal estimates in online platforms where dyadic interactions drive outcomes.

Abstract

In social media platforms, user behavior is often influenced by interactions with other users, complicating the accurate estimation of causal effects in traditional A/B experiments. This study investigates situations where an individual's outcome can be broken down into the sum of multiple pairwise outcomes, a reflection of user interactions. These outcomes, referred to as dyadic data, are prevalent in many social network contexts. Utilizing a Bernoulli randomized design, we introduce a novel unbiased estimator for the total treatment effect (TTE), which quantifies the difference in population mean when all individuals are assigned to treatment versus control groups. We further explore the bias of our estimator in scenarios where it is impractical to include all individuals in the experiment, a common constraint in online control experiments. Our numerical results reveal that our proposed estimator consistently outperforms some commonly used estimators, underscoring its potential for more precise causal effect estimation in social media environments.
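As a sketch of the estimand described above, the TTE can be written as a difference of two population means. The notation $Y_i(\mathbf{z})$ for individual $i$'s potential outcome under the assignment vector $\mathbf{z} \in \{0,1\}^n$, with $\mathbf{1}$ and $\mathbf{0}$ denoting all-treated and all-control assignments, is an assumption introduced here for illustration and may differ from the paper's own symbols:

$\tau = \frac{1}{n} \sum_{i=1}^{n} Y_i(\mathbf{1}) - \frac{1}{n} \sum_{i=1}^{n} Y_i(\mathbf{0}).$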

Paper Structure (23 sections, 9 theorems, 27 equations, 5 figures, 1 table)

Introduction
Problem Setting
Potential outcome model
Network Graph
Neighborhood Interference
Dyadic data setup
Methodology
Dyadic Interference
Estimator
Full population experiment
Sub-population experiment
Sub-population two-stage experiment
Numerical Experiments
Simulated data
Real Data
...and 8 more sections

Key Result

Proposition 3.1

$\mathbb{E}[\hat{\tau}^1(\pi)] = \frac{1}{n} \sum_{i \neq j} ( \gamma_{i,j} + \zeta_{i,j}\pi ).$

Figures (5)

Figure 1: An illustration of the calculation of $Y_i$ and $D_i$. Suppose node $i$ has 7 neighbors, where nodes 1, 2, 3, 4 are upstream neighbors and nodes 5, 6, 7 are downstream neighbors. Each directed edge is associated with an outcome. $Y_i$ is the sum of the paired outcomes from upstream neighbors, which is $z_{1,i} + z_{2,i} + z_{3,i} + z_{4,i}$. $D_i$ is the sum of the paired outcomes from downstream neighbors, which is $z_{i,5} + z_{i,6} + z_{i,7}$.
Figure 2: Visualization of the two-stage experiment process.
Figure 3: Visualizations of the performance of three TTE estimators under full population Bernoulli design on FB-Stanford3 and FB-Cornell5 networks for both Uniform and Bernoulli potential outcomes models. Each line's height represents the relative bias of the estimator.
Figure 4: Visualizations of the performance of three TTE estimators under sub-population Bernoulli design on FB-Stanford3 and FB-Cornell5 networks for both Uniform and Bernoulli potential outcomes models. Each line's height represents the relative bias of the estimator.
Figure 5: Visualizations of the performance of three TTE estimators under sub-population two-stage experiment design on FB-Stanford3 and FB-Cornell5 networks for both Uniform and Bernoulli potential outcomes models. Each line's height represents the relative bias of the estimator.
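
The aggregation illustrated in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `aggregate_dyadic`, the `(source, target, z)` edge format, and the example outcome values are all hypothetical. It assumes only what the caption states, namely that an outcome on a directed edge contributes to the $Y$ of the node it points to and to the $D$ of the node it leaves.

```python
from collections import defaultdict


def aggregate_dyadic(edges):
    """Aggregate dyadic outcomes into per-node totals.

    edges: iterable of (source, target, z), where z is the outcome
    attached to the directed edge source -> target (hypothetical format).
    Returns (Y, D): Y[node] sums outcomes on incoming edges (from upstream
    neighbors), D[node] sums outcomes on outgoing edges (to downstream
    neighbors).
    """
    Y = defaultdict(float)
    D = defaultdict(float)
    for source, target, z in edges:
        Y[target] += z  # z_{source,target} counts toward Y of the target
        D[source] += z  # and toward D of the source
    return dict(Y), dict(D)


# Focal node 0 with upstream neighbors 1-4 and downstream neighbors 5-7,
# mirroring the layout in Figure 1; the outcome values are made up.
edges = [
    (1, 0, 1.0), (2, 0, 2.0), (3, 0, 0.5), (4, 0, 1.5),  # upstream -> node 0
    (0, 5, 3.0), (0, 6, 1.0), (0, 7, 2.0),               # node 0 -> downstream
]
Y, D = aggregate_dyadic(edges)
print(Y[0], D[0])
```

A single pass over the edge list yields both aggregates, so the cost is linear in the number of edges; every edge outcome is counted once in some node's $Y$ and once in some node's $D$.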

Theorems & Definitions (10)

Example 2.1
Proposition 3.1
Proposition 3.2
Theorem 3.1
Proposition 3.3
Proposition 3.4
Theorem 3.2
Proposition 3.5
Proposition 3.6
Theorem 3.3
