THUIR at WSDM Cup 2023 Task 1: Unbiased Learning to Rank

Jia Chen; Haitao Li; Weihang Su; Qingyao Ai; Yiqun Liu

TL;DR

This paper tackles unbiased learning to rank on large-scale, web-search-like data in the WSDM Cup 2023 Task 1 setting. It pre-trains a transformer on click data (with $L_{MLM}$ and $L_{CE}$ losses and a $40\%$ mask rate), combines it with traditional IR signals (BM25) and a broad set of learning-to-rank (LTR) features, and ensembles them via LightGBM. The findings show that BM25 remains robust and that proximity-based LTR features add value, while direct transformer re-ranking on the test data may not consistently improve performance. This highlights the importance of robust feature engineering and careful training on biased signals. The approach achieves 2nd place on the final leaderboard and offers a practical blueprint for unbiased ranking in industry-scale systems with noisy feedback.

Abstract

This paper introduces the approaches we used in the WSDM Cup 2023 Task 1: Unbiased Learning to Rank. In brief, we tried a combination of traditional IR models and transformer-based cross-encoder architectures. To further improve ranking performance, we also considered a series of learning-to-rank features. As a result, we won 2nd place on the final leaderboard.
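As a rough illustration of the traditional IR signal named in the TL;DR, the sketch below computes BM25 scores for a query against a small set of tokenized documents. It is a minimal sketch, not the authors' implementation: the function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, the particular IDF smoothing, and the assumption that inputs are already tokenized are all illustrative choices that this page does not specify.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    """Score each tokenized document against the query with a common BM25 variant.

    Hypothetical helper: k1, b, and the IDF smoothing are assumed defaults,
    not values taken from the paper.
    """
    n_docs = len(docs_tokens)
    # Average document length, used for length normalization.
    avgdl = sum(len(d) for d in docs_tokens) / n_docs if n_docs else 0.0

    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in docs_tokens:
        df.update(set(doc))

    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue  # terms absent from the document contribute nothing
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores
```

In a pipeline like the one summarized above, a score of this kind would typically be one column in the feature matrix passed to the learning-to-rank ensemble, alongside other features.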

Paper Structure (9 sections, 1 equation, 2 tables)

This paper contains 9 sections, 1 equation, 2 tables.

Introduction
Methodology
Pre-training and fine-tuning the transformer with the large-scale click data
Traditional IR methods
Learning to rank features
Other implementation details
An overview of experimental results
Conclusion
Acknowledgements
