Causal Intervention for Fairness in Multi-behavior Recommendation

Xi Wang, Wenjie Wang, Fuli Feng, Wenge Rong, Chuantao Yin, Zhang Xiong

TL;DR

The paper tackles popularity bias in multi-behavior recommender systems, which causes unfair exposure and quality-related disparities across items. It introduces a causal Multi-Behavior Debiasing (MBD) framework that blocks two backdoor paths by interventions on item exposure and item quality, leveraging post-click conversion ratio as a proxy for quality. The method jointly trains CTR and CVR predictors with backdoor-adjusted objectives and then ranks with a deconfounded score that omits popularity effects at inference. Experiments on Kwai and Tmall show that MBD improves both exposure and quality fairness while achieving state-of-the-art accuracy, demonstrating the practical value of causal interventions in multi-behavior recommendation.
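
To make the "post-click conversion ratio as a proxy for quality" idea concrete, here is a minimal sketch of how per-item popularity and conversion ratio could be computed from an interaction log. The log schema `(user_id, item_id, clicked, liked)` and the function name `item_stats` are hypothetical and chosen only for illustration; the paper's actual preprocessing may differ.

```python
from collections import defaultdict


def item_stats(logs):
    """Compute per-item popularity and like/click conversion ratio.

    `logs` is an iterable of (user_id, item_id, clicked, liked) tuples.
    This schema is an assumption made for the example.
    """
    clicks = defaultdict(int)
    likes = defaultdict(int)
    for _user, item, clicked, liked in logs:
        if clicked:
            clicks[item] += 1
            # A like is counted only as a post-click behavior.
            if liked:
                likes[item] += 1
    # Popularity is the click count; the quality proxy is likes / clicks.
    return {
        item: {"popularity": n, "conversion_ratio": likes[item] / n}
        for item, n in clicks.items()
    }


# Toy log: item "a" is more popular, item "b" converts better.
logs = [
    ("u1", "a", 1, 1), ("u2", "a", 1, 0), ("u3", "a", 1, 0), ("u4", "a", 1, 0),
    ("u1", "b", 1, 1), ("u2", "b", 1, 1), ("u3", "b", 0, 0),
]
stats = item_stats(logs)
```

In this toy log, item `a` has more clicks but a lower like/click ratio than item `b`, which is the kind of popularity/quality mismatch the two unfairness issues describe.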

Abstract

Recommender systems usually learn user interests from various user behaviors, including clicks and post-click behaviors (e.g., like and favorite). However, these behaviors inevitably exhibit popularity bias, leading to two unfairness issues: 1) for items with similar quality, more popular ones get more exposure; and 2) even worse, popular items with lower quality might receive more exposure. Existing work on mitigating popularity bias blindly eliminates the bias and usually ignores the effect of item quality. We argue that the relationships between different user behaviors (e.g., conversion rate) actually reflect the item quality. Therefore, to handle the unfairness issues, we propose to mitigate the popularity bias by considering multiple user behaviors. In this work, we examine causal relationships behind the interaction generation procedure in multi-behavior recommendation. Specifically, we find that: 1) item popularity is a confounder between the exposed items and users' post-click interactions, leading to the first unfairness; and 2) some hidden confounders (e.g., the reputation of item producers) affect both item popularity and quality, resulting in the second unfairness. To alleviate these confounding issues, we propose a causal framework to estimate the causal effect, which leverages backdoor adjustment to block the backdoor paths caused by the confounders. In the inference stage, we remove the negative effect of popularity and retain the beneficial effect of quality for recommendation. Experiments on two real-world datasets validate the effectiveness of our proposed framework, which enhances fairness without sacrificing recommendation accuracy.
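
For reference, backdoor adjustment in its standard textbook form can be written as below. This is a generic sketch, not the paper's own derivation: it assumes $I$ denotes the exposed item, $Z$ the item popularity, and $L$ the post-click interaction, following the symbols in the figure captions. The paper's actual estimands, which also involve the user and an intervention on quality $Q$, are not reproduced here.

$$P(L \mid do(I = i)) = \sum_{z} P(L \mid I = i, Z = z)\, P(Z = z)$$

The adjusted quantity averages over the marginal distribution $P(Z = z)$ rather than the conditional $P(Z = z \mid I = i)$, which is what removes the spurious dependence that the confounder introduces between exposure and the outcome.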

Paper Structure (22 sections, 15 equations, 10 figures, 5 tables)

Figures (10)

  • Figure 1: Illustration of two kinds of unfairness. (a) Items' ranks and exposure w.r.t. increasing clicks (popularity) for items of similar quality, i.e., with like/click ratios within $[0.1, 0.14]$; (b) the ranks and exposure w.r.t. ascending like/click ratios for items with the same number of likes (i.e., 5). The results are obtained by training ESMM (Ma et al., 2018) on the Kwai dataset (Zhang et al., 2021).
  • Figure 2: Causal graph for multi-behavior recommendation. (1) and (2) are two backdoor paths. The confounding effects of $Z$ from $I$ to $L$ and from $Q$ to $L$ are eliminated by the two interventions on $I$ and $Q$.
  • Figure 3: Collections of articles w.r.t. post-click conversion ratio from reads to likes.
  • Figure 4: The implementation of MBD. The CTR function and CVR function are the user-item matching functions $f_{c}$ and $f_{l}$, respectively. The purple arrows indicate that the item popularity $Z$ is used in the training stage but removed from the inference stage.
  • Figure 5: Distribution of top-$50$ recommended items w.r.t. ratio on the two datasets. The first sub-figure shows the distribution on Kwai and the second one shows that on Tmall.
  • ...and 5 more figures