Be Aware of the Neighborhood Effect: Modeling Selection Bias under Interference

Haoxuan Li; Chunyuan Zheng; Sihao Ding; Peng Wu; Zhi Geng; Fuli Feng; Xiangnan He

TL;DR

This work addresses selection bias in recommender systems under neighborhood interference by formulating the problem as causal learning with interference. It introduces a learnable neighborhood treatment representation $\boldsymbol{g}$ and a neighborhood-aware ideal loss $\mathcal{L}_{\mathrm{ideal}}^\mathrm{N}$, then develops two unbiased estimators, neighborhood IPS (N-IPS) and neighborhood DR (N-DR), based on kernel smoothing to handle continuous $\boldsymbol{g}$. The authors prove identifiability, derive bias-variance and optimal bandwidth results, and provide generalization bounds, demonstrating that N-IPS/N-DR can achieve unbiased learning when both selection bias and neighborhood effect are present. Empirical evaluation on semi-synthetic MovieLens data and real-world datasets (Coat, Yahoo! R3, KuaiRec) shows substantial improvements over traditional IPS/DR methods, highlighting the practical significance of accounting for neighborhood interference in debiasing recommender systems.

Abstract

Selection bias in recommender systems arises from the recommendation process of system filtering and the interactive process of user selection. Many previous studies have focused on addressing selection bias to achieve unbiased learning of the prediction model, but they ignore the fact that the potential outcomes for a given user-item pair may vary with the treatments assigned to other user-item pairs, a phenomenon called the neighborhood effect. To fill this gap, this paper formalizes the neighborhood effect as an interference problem from the perspective of causal inference and introduces a treatment representation to capture it. On this basis, we propose a novel ideal loss that can be used to deal with selection bias in the presence of the neighborhood effect. We further develop two new estimators for the proposed ideal loss. We theoretically establish the connection between the proposed methods and previous debiasing methods that ignore the neighborhood effect, showing that the proposed methods can achieve unbiased learning when both selection bias and the neighborhood effect are present, while the existing methods are biased. Extensive semi-synthetic and real-world experiments demonstrate the effectiveness of the proposed methods.
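The TL;DR describes the two estimators as inverse-propensity-style estimators that use kernel smoothing to handle a continuous treatment representation. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a scalar representation, a Gaussian kernel, and pre-computed estimates of the joint propensity density $p(o = 1, g \mid x)$. The function names `gaussian_kernel` and `kernel_ips_loss` and all parameter names are hypothetical.

```python
import math


def gaussian_kernel(t):
    """Standard Gaussian kernel K(t); the kernel choice is an assumption."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def kernel_ips_loss(errors, observed, g_values, joint_propensities,
                    g_target, bandwidth):
    """Kernel-smoothed IPS estimate of the average loss at g_target.

    errors[k]             : prediction error of pair k (used only if observed)
    observed[k]           : 1 if the outcome of pair k was observed, else 0
    g_values[k]           : scalar treatment representation of pair k (assumed scalar)
    joint_propensities[k] : assumed estimate of the density p(o=1, g=g_target | x_k)
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    total = 0.0
    for err, o, g, p in zip(errors, observed, g_values, joint_propensities):
        if o == 1:
            # Observed pairs are weighted by how close their representation
            # is to g_target, then divided by the joint propensity.
            weight = gaussian_kernel((g - g_target) / bandwidth) / (bandwidth * p)
            total += weight * err
    # Average over all pairs, observed or not, as in standard IPS.
    return total / len(errors)
```

In this sketch the bandwidth controls a trade-off that is general to kernel smoothing: a smaller value restricts the estimate to pairs whose representation is close to `g_target` but leaves fewer effective samples, which is why the choice of bandwidth is a natural subject for theoretical analysis.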

Paper Structure (24 sections, 18 theorems, 107 equations, 2 figures, 2 tables, 4 algorithms)

Introduction
Preliminaries: Previous Selection Bias Formulation
Modeling Selection Bias under Neighborhood Effect
Beyond “No Interference” Assumption in Previous Studies
Proposed Causal Parameter of Interest under Interference
Unbiased Estimation and Learning under Interference
Effect of Ignoring Interference
Proposed Unbiased Estimators
Propensity Estimation Method
Further Theoretical Analysis
Semi-synthetic Experiments
Real-World Experiments
Conclusion
Acknowledgement
Related Work
...and 9 more sections

Key Result

Theorem 1

Under Assumptions 1--3, $\mathcal{L}_{\mathrm{ideal}}^\mathrm{N}(\hat{{\bf R}} | \boldsymbol{g})$ and $\mathcal{L}_{\mathrm{ideal}}^\mathrm{N}(\hat{{\bf R}})$ are identifiable.
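Identifiability here means that the target loss, which is defined through unobserved potential outcomes, can be rewritten in terms of the observed-data distribution. The sketch below shows the standard form such an argument usually takes; it is not taken from the paper's proof. It assumes that $\delta_{u,i}(\boldsymbol{g})$ denotes the potential prediction error of pair $(u, i)$ under $o_{u,i} = 1$ and representation $\boldsymbol{g}$, that $\delta_{u,i}$ is its observed counterpart, and that the assumptions include conditions of the usual unconfoundedness and consistency type. The notation and the labels on each step are assumptions.

```latex
\begin{align*}
\mathbb{E}\big[\delta_{u,i}(\boldsymbol{g}) \mid x_{u,i}\big]
  &= \mathbb{E}\big[\delta_{u,i}(\boldsymbol{g}) \mid x_{u,i},\, o_{u,i}=1,\, \boldsymbol{g}_{u,i}=\boldsymbol{g}\big]
  && \text{(unconfoundedness, assumed)} \\
  &= \mathbb{E}\big[\delta_{u,i} \mid x_{u,i},\, o_{u,i}=1,\, \boldsymbol{g}_{u,i}=\boldsymbol{g}\big]
  && \text{(consistency, assumed)}
\end{align*}
```

The last line involves only observed quantities, so averaging it over user-item pairs gives a quantity that can in principle be estimated from data.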

Figures (2)

Figure 1: Causal diagrams of the existing debiasing methods under no interference assumption (left), and the proposed method taking into account the presence of interference (right), where $x_{u, i}$, $o_{u, i}$, and $r_{u, i}$ denote the confounder, treatment, and outcome of user-item pair $(u, i)$, respectively. In the presence of interference, $\mathcal{N}_{(u, i)}$ and $\mathcal{N}_{-(u, i)}$ denote the other user-item pairs affecting and not affecting $(u, i)$, respectively, and $\boldsymbol{g}_{u,i}$ denotes the treatment representation to capture the interference.
Figure 2: Effect of the number of masks, used as a measure of interference strength, on RE for six prediction matrices.

Theorems & Definitions (31)

Theorem 1: Identifiability
Theorem 2: Link to Selection Bias
Theorem 3: Bias and Variance of N-IPS and N-DR
Theorem 4: Optimal Bandwidth of N-IPS and N-DR
Theorem 5: Generalization Error Bounds of N-IPS and N-DR
Theorem 1: Identifiability
Proof of Theorem 1
Theorem 2: Link to Selection Bias
Proof of Theorem 2
Theorem 3: Bias and Variance of N-IPS and N-DR
...and 21 more
