Unbiased Platform-Level Causal Estimation for Search Systems: A Competitive Isolation PSM-DID Framework
Ying Song, Yijing Wang, Hui Yang, Weihan Jin, Jun Xiong, Congyi Zhou, Jialin Zhu, Xiang Gao, Rong Chen, HuaGuang Deng, Ying Dai, Fei Xiao, Haihong Tang, Bo Zheng, KaiFu Zhang
TL;DR
This work addresses the problem of estimating platform-level causal effects in search-based two-sided marketplaces, where interference between units is widespread. It presents the Competitive Isolation PSM-DID (CI-PSM-DID) framework, which combines min-cut mutual-exclusion graph partitioning, Stratified CTCVR Matching, and a two-sided sinking DID. Under the mutual-exclusion and parallel-trends assumptions, the framework yields unbiased estimates that coincide with those of a perfect A/B test, i.e., $\hat{\tau} \equiv \Delta^{*}$. The authors provide theoretical guarantees, scalable algorithms, and extensive offline and online validation, reporting substantial reductions in cannibalization and estimation variance and demonstrating actionable platform-level lift measurements (e.g., GMV and order volume) at scale. An open dataset is released to support reproducible research on marketplace interference. Overall, CI-PSM-DID offers a practical, scalable tool for platform-level causal inference in marketplaces where cross-unit interference is pervasive.
Abstract
Evaluating platform-level interventions in search-based two-sided marketplaces is fundamentally challenged by systemic effects such as spillovers and network interference. Although widely used for causal inference, the PSM-DID framework (Propensity Score Matching combined with Difference-in-Differences) remains susceptible to selection bias and to cross-unit interference from unaccounted spillovers. In this paper, we introduce Competitive Isolation PSM-DID, a novel causal framework that integrates propensity score matching with competitive isolation to measure platform-level effects (e.g., order volume, GMV) rather than item-level metrics in search systems. Our approach provides theoretically guaranteed unbiased estimation under mutual exclusion conditions, and we release an open dataset to support reproducible research on marketplace interference (github.com/xxxx). Extensive experiments demonstrate significant reductions in interference effects and estimation variance compared to baseline methods. Successful deployment in a large-scale marketplace confirms the framework's practical utility for platform-level causal inference.
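
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of a generic, textbook-style PSM-DID estimator. It is not the CI-PSM-DID method: it omits competitive isolation, min-cut partitioning, and stratified matching, and it assumes propensity scores have already been estimated elsewhere. The function names, the caliper value, and the toy numbers are illustrative assumptions, not taken from the paper.

```python
# Generic PSM-DID baseline sketch (assumed names and toy data, not the authors' code).

def match_nearest(ps_treated, ps_control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on propensity score, without replacement.

    Returns a list of (treated_index, control_index) pairs whose score
    difference is at most `caliper`.
    """
    pairs = []
    used = set()
    for i, p in enumerate(ps_treated):
        # Distance to every control unit that has not been matched yet.
        candidates = [(abs(q - p), j) for j, q in enumerate(ps_control) if j not in used]
        if not candidates:
            break
        dist, j = min(candidates)
        if dist <= caliper:
            pairs.append((i, j))
            used.add(j)
    return pairs


def did_estimate(pre_t, post_t, pre_c, post_c, pairs):
    """Mean over matched pairs of (treated change) minus (control change)."""
    if not pairs:
        raise ValueError("no matched pairs; consider widening the caliper")
    diffs = [(post_t[i] - pre_t[i]) - (post_c[j] - pre_c[j]) for i, j in pairs]
    return sum(diffs) / len(diffs)


# Toy data: two treated units, three control units (hypothetical values).
ps_treated = [0.30, 0.60]
ps_control = [0.31, 0.58, 0.90]
pre_t, post_t = [10.0, 20.0], [15.0, 26.0]
pre_c, post_c = [10.0, 21.0, 50.0], [12.0, 24.0, 55.0]

pairs = match_nearest(ps_treated, ps_control)
tau = did_estimate(pre_t, post_t, pre_c, post_c, pairs)
print(pairs, tau)
```

In this sketch nothing prevents a control unit from being affected by its matched treated unit (for example, by competing for the same search traffic). That is the kind of cross-unit interference the abstract says plain PSM-DID leaves unaddressed.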
