Unbiased Learning to Rank with Query-Level Click Propensity Estimation: Beyond Pointwise Observation and Relevance

Lulu Yu, Keping Bi, Jiafeng Guo, Shihao Liu, Dawei Yin, Xueqi Cheng

TL;DR

This work addresses biases in unbiased learning-to-rank (ULTR) beyond the traditional position-examination assumption by introducing relevance saturation bias, where queries with more relevant results are more likely to attract clicks, potentially leaving relevant results unclicked. It proposes DualIPW, a dual inverse propensity weighting framework that combines query-level and position-level propensities, supported by a provable unbiased learning objective and a learned query-level propensity estimator based on log-ratio features and an LSTM. The approach yields significant improvements over strong ULTR baselines on real-world Baidu-ULTR data, with ablations confirming that both query- and position-level components are essential. The work advances practical ULTR by better accounting for real user behavior and logging biases, enabling more reliable ranking in production settings; code and data are publicly available for reproduction.

Abstract

Most existing unbiased learning-to-rank (ULTR) approaches are based on the user examination hypothesis, which assumes that users will click a result only if it is both relevant and observed (typically modeled by position). However, in real-world scenarios, users often click only one or two results after examining multiple relevant options, due to limited patience or because their information needs have already been satisfied. Motivated by this, we propose a query-level click propensity model to capture the probability that users will click on different result lists, allowing for non-zero probabilities that users may not click on an observed relevant result. We hypothesize that this propensity increases when more potentially relevant results are present, and refer to this user behavior as relevance saturation bias. Our method introduces a Dual Inverse Propensity Weighting (DualIPW) mechanism -- combining query-level and position-level IPW -- to address both relevance saturation and position bias. Through theoretical derivation, we prove that DualIPW can learn an unbiased ranking model. Experiments on the real-world Baidu-ULTR dataset demonstrate that our approach significantly outperforms state-of-the-art ULTR baselines. The code and dataset information can be found at https://github.com/Trustworthy-Information-Access/DualIPW.
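The dual weighting idea can be illustrated with a small sketch. The abstract does not give the exact form of the objective, so this sketch assumes that the two propensities combine multiplicatively: each click is weighted by the inverse of its position-level propensity times the inverse of the session's query-level propensity. The function names, the clipping threshold, and the softmax listwise loss are all hypothetical choices for illustration, not the authors' implementation.

```python
import math


def dual_ipw_weights(clicks, position_propensity, query_propensity, clip=1e-3):
    """Per-document click weights for one session.

    Assumption (not from the source): the weight of a click is
    1 / (position propensity * query-level propensity).
    `clip` lower-bounds both propensities to avoid exploding weights.
    """
    q = max(query_propensity, clip)
    weights = []
    for rank, click in enumerate(clicks):
        p = max(position_propensity[rank], clip)
        weights.append(click / (p * q))
    return weights


def weighted_softmax_loss(scores, weights):
    """Listwise softmax cross-entropy with a weight per document."""
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return -sum(w * (s - log_z) for s, w in zip(scores, weights))


# Hypothetical session: three results, one click at rank 2.
weights = dual_ipw_weights(
    clicks=[0, 1, 0],
    position_propensity=[1.0, 0.5, 0.25],
    query_propensity=0.5,
)
loss = weighted_softmax_loss(scores=[0.0, 0.0, 0.0], weights=weights)
```

Under this assumption, a click in a session with low query-level propensity (a list on which users were unlikely to click at all) counts for more than a click in a session where clicks were likely, in the same way that position-level IPW up-weights clicks at rarely examined ranks.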

Paper Structure

This paper contains 21 sections, 10 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: (a) The distribution of single-click sessions at each position and the performance of models trained on 10 single-click groups. (b) The distribution of maximal-score positions of single-click sessions at positions 1, 2, 3, and 4.
  • Figure 2: Query-level click propensity estimation.
  • Figure 3: Fine-grained analysis and click weights.