
Integrating Active Learning in Causal Inference with Interference: A Novel Approach in Online Experiments

Hongtao Zhu, Sizhe Zhang, Yang Su, Zhenyu Zhao, Nan Chen

TL;DR

This paper tackles causal inference in networks where interference is present, challenging the no-interference assumption of traditional Rubin Causal Models. It introduces Active Learning in Causal Inference with Interference (ACI), a framework that combines Gaussian process regression to model direct and spillover effects with an active-learning loop and a genetic-algorithm-based design to optimize experimental treatments under network constraints. The approach yields accurate effect estimates with substantially reduced data requirements, demonstrated through simulations and a Tencent game dataset. Practically, ACI offers a data-efficient pathway for estimating complex interference patterns in online experiments and other networked settings.

Abstract

In causal inference research, the prevalent potential outcomes framework, notably the Rubin Causal Model (RCM), often overlooks interference between individuals and assumes independent treatment effects. This assumption is frequently at odds with real-world settings, where interference is common rather than exceptional. Our research addresses this discrepancy by estimating direct and spillover treatment effects under two assumptions: (1) network-based interference, where treatments on neighbors within connected networks affect one's outcomes, and (2) non-random treatment assignments influenced by confounders. To improve the efficiency of estimating potentially complex effect functions, we introduce a novel active learning approach: Active Learning in Causal Inference with Interference (ACI). This approach uses Gaussian process regression to flexibly model the direct and spillover treatment effects as a function of a continuous measure of neighbors' treatment assignment. The ACI framework sequentially identifies the experimental settings that demand further data. It then optimizes the treatment assignments under the network interference structure using genetic algorithms to achieve efficient learning. By applying our method to simulated data and a Tencent game dataset, we demonstrate its feasibility in achieving accurate effect estimates with reduced data requirements. ACI thus offers a data-efficient alternative to traditional methodologies, particularly in settings with complex interference patterns.
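The pipeline described in the abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration, not the paper's implementation: the exposure measure is taken to be the fraction of treated neighbors, the Gaussian process uses a squared-exponential kernel with fixed hyperparameters, the next experimental setting is chosen by maximum posterior variance, and `toy_spillover` is a hypothetical effect function. The genetic-algorithm step that turns a target exposure level into a feasible network assignment is not shown.

```python
import numpy as np

def neighbor_exposure(adjacency, treatment):
    """Fraction of each individual's neighbors that are treated (0 if isolated).
    Assumed exposure measure; the paper's exact definition may differ."""
    degree = adjacency.sum(axis=1)
    treated = adjacency @ treatment
    out = np.zeros(len(treatment), dtype=float)
    return np.divide(treated, degree, out=out, where=degree > 0)

def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of exposure levels."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    alpha = np.linalg.solve(k_tt, y_train)   # K^{-1} y
    v = np.linalg.solve(k_tt, k_qt.T)        # K^{-1} k(train, query)
    mean = k_qt @ alpha
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(k_qt * v.T, axis=1)
    return mean, np.maximum(var, 0.0)

def toy_spillover(exposure):
    """Hypothetical spillover effect curve, used only to generate data."""
    return 0.5 * exposure ** 2

def active_learning_loop(effect_fn, candidates, n_rounds=8, seed=0):
    """Start from the two extreme exposure levels, then repeatedly query the
    candidate level where the GP is most uncertain (assumed acquisition rule)."""
    rng = np.random.default_rng(seed)
    x = np.array([candidates[0], candidates[-1]])
    y = effect_fn(x) + rng.normal(0.0, 0.05, size=x.shape)
    for _ in range(n_rounds):
        _, var = gp_posterior(x, y, candidates)
        nxt = candidates[np.argmax(var)]
        x = np.append(x, nxt)
        y = np.append(y, effect_fn(nxt) + rng.normal(0.0, 0.05))
    return x, y

if __name__ == "__main__":
    levels = np.linspace(0.0, 1.0, 21)
    x_obs, y_obs = active_learning_loop(toy_spillover, levels)
    mean, var = gp_posterior(x_obs, y_obs, levels)
    print("queried exposure levels:", np.round(x_obs, 2))
    print("largest remaining posterior variance:", float(var.max()))
```

Selecting by posterior variance is one common uncertainty-sampling rule; it concentrates new observations where the estimated effect curve is least constrained, which is the intuition behind the reduced data requirements claimed above.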

Paper Structure

This paper contains 22 sections, 27 equations, 7 figures, 1 table, and 3 algorithms.

Figures (7)

  • Figure 1: Illustrative causal structure of individual $i$ interfering with individual $j$.
  • Figure 2: The social network of 50 individuals.
  • Figure 3: The structure of our active learning framework designed for causal inference with interference (GA: genetic algorithm, GPR: Gaussian process regression). A detailed exposition of this framework is provided in Section \ref{sec:proposed_method}.
  • Figure 4: Illustrative Sub-Network Centered Around Individual $1$, where connections indicate relationships between individuals, e.g., $w_{1,0} = 1$.
  • Figure 5: Active Learning in Causal Inference with Interference (ACI) method: The light pink and dark blue curves represent the estimated average overall and spillover effects, respectively. Points along these curves represent sequentially selected treatment levels. Not every update is depicted: consecutive subfigures may differ by more than one update; for example, the transition from subfigure (a) to (b) introduces two additional sequential treatment levels. The true effects are shown by the two orange dashed curves.
  • ...and 2 more figures