InfoRank: Unbiased Learning-to-Rank via Conditional Mutual Information Minimization

Jiarui Jin, Zexue He, Mengyue Yang, Weinan Zhang, Yong Yu, Jun Wang, Julian McAuley

TL;DR

InfoRank tackles bias in implicit feedback during ranking by unifying position and popularity biases into an observation factor and enforcing conditional independence between relevance and observation estimations. It leverages a multi-head attention-based estimator to model $P(R=1|O,X)$ and $P(O=1|X)$ and introduces a conditional mutual information regularizer $\mathcal{I}(R;O|X)$ to promote unbiased relevance. The model optimizes a joint objective combining a supervised loss on clicks with the MI regularization, and its two-tower design remains effective even with estimated observations. Empirical results on Yahoo, LETOR, and Adressa show that InfoRank achieves superior unbiased ranking and robust debiasing across diverse user browsing patterns.
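
For reference, the regularizer named above can be read through the textbook definition of conditional mutual information for binary $R$ and $O$. The expansion below is that standard definition, not a formula quoted from the paper, whose exact estimator may differ:

$$
\mathcal{I}(R;O|X) = \mathbb{E}_{X}\left[\sum_{r\in\{0,1\}}\sum_{o\in\{0,1\}} P(r,o|X)\,\log\frac{P(r,o|X)}{P(r|X)\,P(o|X)}\right]
$$

This quantity is non-negative and equals zero exactly when $R$ and $O$ are conditionally independent given $X$, which is the property the regularizer is meant to promote.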

Abstract

Ranking items regarding individual user interests is a core technique of multiple downstream tasks such as recommender systems. Learning such a personalized ranker typically relies on the implicit feedback from users' past click-through behaviors. However, collected feedback is biased toward previously highly-ranked items and directly learning from it would result in a "rich-get-richer" phenomenon. In this paper, we propose a simple yet sufficient unbiased learning-to-rank paradigm named InfoRank that aims to simultaneously address both position and popularity biases. We begin by consolidating the impacts of those biases into a single observation factor, thereby providing a unified approach to addressing bias-related issues. Subsequently, we minimize the mutual information between the observation estimation and the relevance estimation conditioned on the input features. By doing so, our relevance estimation can be proved to be free of bias. To implement InfoRank, we first incorporate an attention mechanism to capture latent correlations within user-item features, thereby generating estimations of observation and relevance. We then introduce a regularization term, grounded in conditional mutual information, to promote conditional independence between relevance estimation and observation estimation. Experimental evaluations conducted across three extensive recommendation and search datasets reveal that InfoRank learns more precise and unbiased ranking strategies.
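
A minimal sketch of how a click loss and a conditional mutual information penalty might be combined, assuming binary $R$ and $O$ and per-example probability estimates. The function names `conditional_mi` and `joint_loss`, the click model $P(C=1|X)=P(O=1|X)\,P(R=1|O=1,X)$ (the common examination hypothesis), and the default value of `eta` are illustrative assumptions, not the authors' implementation; only the regularization weight $\eta$ itself is named in the paper.

```python
import numpy as np


def _binary_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), elementwise."""
    p = np.clip(p, eps, 1.0 - eps)
    q = np.clip(q, eps, 1.0 - eps)
    return p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q))


def conditional_mi(p_o, p_r_given_o1, p_r_given_o0):
    """Per-example I(R;O|X=x) for binary R and O.

    p_o           : estimate of P(O=1 | x)
    p_r_given_o1  : estimate of P(R=1 | O=1, x)
    p_r_given_o0  : estimate of P(R=1 | O=0, x)
    Uses I(R;O|x) = sum_o P(o|x) * KL(P(R|o,x) || P(R|x)).
    """
    p_o = np.asarray(p_o, dtype=float)
    a = np.asarray(p_r_given_o1, dtype=float)
    b = np.asarray(p_r_given_o0, dtype=float)
    p_r = p_o * a + (1.0 - p_o) * b  # marginal P(R=1 | x)
    return p_o * _binary_kl(a, p_r) + (1.0 - p_o) * _binary_kl(b, p_r)


def joint_loss(clicks, p_o, p_r_given_o1, p_r_given_o0, eta=0.1, eps=1e-12):
    """Click cross-entropy plus eta times the mean conditional MI penalty.

    ASSUMPTION: clicks follow the examination hypothesis,
    P(C=1|x) = P(O=1|x) * P(R=1|O=1,x). eta=0.1 is an arbitrary default.
    """
    clicks = np.asarray(clicks, dtype=float)
    p_click = np.asarray(p_o, dtype=float) * np.asarray(p_r_given_o1, dtype=float)
    p_click = np.clip(p_click, eps, 1.0 - eps)
    bce = -np.mean(clicks * np.log(p_click) + (1.0 - clicks) * np.log(1.0 - p_click))
    penalty = np.mean(conditional_mi(p_o, p_r_given_o1, p_r_given_o0))
    return bce + eta * penalty
```

When the relevance estimate does not depend on the observation value (`p_r_given_o1 == p_r_given_o0`), the penalty vanishes and the objective reduces to the plain click loss. In a trainable model the same expressions would be written in an autodiff framework so the penalty can be minimized by gradient descent.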

Paper Structure (31 sections, 4 theorems, 22 equations, 8 figures, 3 tables, 1 algorithm)

Key Result

Proposition 1

Given that relevance, click, and observation variables are binary (i.e., $R, C, O\in\{1,0\}$), for any user-item pair with feature $X$, the following statements are equivalent:

Figures (8)

  • Figure 1: An illustrated example of the feedback loop, position bias, and popularity bias in learning-to-rank. Within this process, the ranking system blends user and item features (c) with implicit feedback to generate the final ranking list. However, this system is susceptible to both position bias and popularity bias (b). Furthermore, these biases tend to be amplified within the feedback loop (a), potentially resulting in a "rich-get-richer" dilemma.
  • Figure 2: The overall architecture of InfoRank, where (a) we first leverage an attention mechanism to mine correlations between user-item features (Section \ref{subsec:causal}); and (b) we then introduce a regularization term (i.e., $\mathcal{I}$) based on conditional mutual information to make relevance conditionally independent of the observation factor (Section \ref{subsec:counterfactual}). To capture relevance within biased feedback, we combine this regularization term with supervision (i.e., $\mathcal{L}$) over user behaviors (Section \ref{subsec:optimize}). We note that InfoRank continues to work even when no observation information is available in user browsing logs. In such cases, we substitute real observations with estimated ones.
  • Figure 3: Average positions after re-ranking of items at each normalized frequency (in the left subfigure); or at each original position (in the right subfigure) by different debiasing methods together with InfoRank and InfoRank$^-$ on Yahoo.
  • Figure 4: Comparison of InfoRank and InfoRank$^-$ under different click generation models and datasets in terms of the $\Delta$CI metric.
  • Figure 5: Left: Average frequency of different item groups recommended by InfoRank (Ranking) incorporated with InfoRank (Debiasing) and Click Data on Adressa. Right: Performance change of InfoRank with different regularization weight $\eta$ on Yahoo.
  • ...and 3 more figures

Theorems & Definitions (4)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Proposition 4