Estimating Treatment Effects under Recommender Interference: A Structured Neural Networks Approach
Ruohan Zhan, Shichao Han, Yuchen Hu, Zhenling Jiang
TL;DR
This work addresses bias in creator-side recommender experiments caused by interference among items competing for exposure. It introduces a structured neural framework that combines a semi-parametric recommender choice model built on neural networks with a neural viewer-response model, paired with a double/debiased machine learning (DML) estimator that achieves $\sqrt{n}$-consistent inference under correlated data. The authors prove Neyman orthogonality for the debiased estimator, extend DML to handle correlated samples, and validate the method through Monte Carlo simulations and a large-scale Weixin field experiment, benchmarked against ground truth from a costly double-sided randomization design. The results show that the proposed approach reliably recovers the ground truth, while standard difference-in-means (DIM) and propensity-based estimators can produce biased or even sign-reversed estimates, making the approach a practically valuable tool for high-stakes policy decisions on online platforms.
Abstract
Recommender systems are essential to content-sharing platforms because they curate personalized content. To improve recommender systems, platforms frequently rely on creator-side randomized experiments to evaluate algorithm updates. We show that commonly adopted difference-in-means estimators can lead to severely biased estimates due to recommender interference, where treated and control creators compete for exposure. This bias can result in incorrect business decisions. To address this, we propose a "recommender choice model" that explicitly represents the interference pathway. The approach combines a structural choice framework with neural networks to account for rich viewer-content heterogeneity. Building on this foundation, we develop a debiased estimator using the double machine learning (DML) framework to adjust for errors from nuisance component estimation. We show that the estimator is $\sqrt{n}$-consistent and asymptotically normal, and we extend the DML theory to handle correlated data, which arise in our context due to overlapping items. We validate our method with a large-scale field experiment on the Weixin short-video platform, using a costly double-sided randomization design to obtain an interference-free ground truth. Our results show that the proposed estimator successfully recovers this ground truth, whereas benchmark estimators exhibit substantial bias and, in some cases, yield estimates with reversed signs.
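The interference pathway described above can be illustrated with a toy simulation. The sketch below is not the authors' model or estimator. It assumes, purely for illustration, that each viewer query exposes exactly one item drawn from a softmax over candidate scores, that treatment adds a constant boost to an item's score, and that the outcome is expected views per item. The names `expected_views`, `boost`, and all parameter values are hypothetical. Under these assumptions, treating every item leaves total exposure unchanged, so the global effect is zero, yet the difference-in-means computed from a half-treated experiment is clearly positive because treated items take exposure from control items.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_views(base_scores, treated, boost, n_queries, k, rng):
    """Expected views per item when each query exposes one of k random candidates.

    Assumption (toy model): exposure follows a softmax over base score plus
    a constant boost for treated items.
    """
    n = len(base_scores)
    views = [0.0] * n
    for _ in range(n_queries):
        candidates = rng.sample(range(n), k)
        probs = softmax([base_scores[i] + boost * treated[i] for i in candidates])
        for i, p in zip(candidates, probs):
            views[i] += p
    return views


def mean(xs):
    return sum(xs) / len(xs)


# Hypothetical sizes chosen only to keep the example fast.
n_items, n_queries, k, boost = 200, 5000, 10, 1.0
setup_rng = random.Random(0)
base = [setup_rng.gauss(0.0, 0.5) for _ in range(n_items)]

# Creator-side experiment: half of the items are treated at random.
assignment = [1] * (n_items // 2) + [0] * (n_items - n_items // 2)
setup_rng.shuffle(assignment)

# Reusing the same seed keeps candidate sets identical across scenarios.
views_exp = expected_views(base, assignment, boost, n_queries, k, random.Random(1))
views_all_treated = expected_views(base, [1] * n_items, boost, n_queries, k, random.Random(1))
views_all_control = expected_views(base, [0] * n_items, boost, n_queries, k, random.Random(1))

# Naive difference-in-means from the experiment.
dim = (mean([v for v, t in zip(views_exp, assignment) if t == 1])
       - mean([v for v, t in zip(views_exp, assignment) if t == 0]))

# Global effect: everyone treated versus everyone in control.
true_effect = mean(views_all_treated) - mean(views_all_control)

print(f"difference-in-means estimate: {dim:.3f}")
print(f"global treatment effect:      {true_effect:.3f}")
```

In this toy setting the gap between the two printed numbers is entirely interference bias. An estimator that models how the recommender allocates exposure, as the abstract proposes, is designed to remove this kind of bias.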
