CIR at the NTCIR-17 ULTRE-2 Task

Lulu Yu, Keping Bi, Jiafeng Guo, Xueqi Cheng

TL;DR

The paper tackles unbiased learning to rank (ULTR) on real-world Baidu click data, where false negatives are a major challenge. It extends the Dual Learning Algorithm (DLA) with two strategies: label correction for non-clicked items (DLA-LC), and negative sampling that injects true and hard negatives. A LightGBM LambdaRank model trained on human annotations serves as an upper bound. The best-performing method, Scratch-DLA-LC (sig), achieves an nDCG@10 of 0.5355, about 2.66% better than the organizer's best score, which shows that addressing false negatives helps beyond correcting traditional position bias. The work offers practical guidance for ULTR on real-world datasets and suggests that integrating pre-trained scores could yield further gains in future iterations.
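Since the headline result is reported as nDCG@10, a minimal sketch of that metric may help readers interpret the number. It assumes the common exponential-gain, log2-discount formulation; the function names are illustrative, and the official ULTRE-2 evaluation script may differ in its details.

```python
import math

def dcg_at_k(relevances, k=10):
    """DCG over the top-k items, with gain 2^rel - 1 (an assumed convention)."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """Normalize DCG by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

Here `relevances` is the list of graded human labels in the order the model ranked the documents for one query; the reported score would be the mean over all test queries.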

Abstract

The Chinese Academy of Sciences Information Retrieval team (CIR) participated in the NTCIR-17 ULTRE-2 task. This paper describes our approaches and reports our results on that task. We observe that the false negative issue in the Baidu search data used in this competition is very severe, much more severe than position bias. Hence, we adopt the Dual Learning Algorithm (DLA) to address position bias and use it as an auxiliary model to study how to alleviate the false negative issue. We approach the problem from two perspectives: 1) correcting the labels of non-clicked items with a relevance judgment model trained by DLA, and learning a new ranker that is initialized from DLA; 2) including random documents as true negatives and documents with partial matching as hard negatives. Both methods enhance model performance, and our best method achieves an nDCG@10 of 0.5355, which is 2.66% better than the best score from the organizer.
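The first perspective, label correction, can be illustrated with a small sketch. It assumes that a non-clicked item's hard 0 label is replaced by a soft label obtained by passing the auxiliary DLA ranker's score through a sigmoid. That rule, along with the names `correct_labels` and `dla_scores`, is an assumption made for illustration and not necessarily the exact procedure used in the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def correct_labels(clicks, dla_scores):
    """Keep clicked items as positives (1.0); give each non-clicked item
    a soft label derived from the auxiliary model's relevance score."""
    return [1.0 if clicked else sigmoid(score)
            for clicked, score in zip(clicks, dla_scores)]

# One query session: only the first result was clicked.
labels = correct_labels([1, 0, 0], [0.3, 0.0, 2.0])
```

Under this assumption, a non-clicked document that the auxiliary model scores highly is no longer treated as a certain negative when the new ranker is trained.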

Paper Structure

This paper contains 13 sections, 6 equations, 1 figure, and 2 tables.

Figures (1)

  • Figure 1: Performance curves of the two schemes ("click-only" and "last-click") w.r.t. the number of random and hard negatives. (a) Performance curves of both schemes w.r.t. the number of random negatives. (b) Performance curve of the "click-only" scheme w.r.t. the number of hard negatives.
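The figure varies the number of random and hard negatives added per query. A rough sketch of such a sampling step follows, using a toy corpus that maps document ids to term sets. The partial-match rule (some but not all query terms present) and every name here are hypothetical; the source states only that hard negatives are documents with partial matching.

```python
import random

def sample_negatives(query_terms, session_doc_ids, corpus,
                     n_random, n_hard, seed=0):
    """corpus: dict of doc_id -> set of terms (toy representation).
    Returns (random_negatives, hard_negatives) as lists of doc ids."""
    rng = random.Random(seed)
    query = set(query_terms)
    # Never sample documents that were shown in the session itself.
    candidates = [d for d in corpus if d not in session_doc_ids]
    # Assumed definition of a partial match: some, but not all, query terms.
    partial = [d for d in candidates
               if 0 < len(query & corpus[d]) < len(query)]
    hard = rng.sample(partial, min(n_hard, len(partial)))
    # Random negatives come from the remaining candidates.
    rest = [d for d in candidates if d not in hard]
    rand = rng.sample(rest, min(n_random, len(rest)))
    return rand, hard

toy_corpus = {"d1": {"a", "b"}, "d2": {"a"}, "d3": {"c"},
              "d4": {"b", "x"}, "d5": {"z"}}
rand_negs, hard_negs = sample_negatives(["a", "b"], {"d1"}, toy_corpus,
                                        n_random=2, n_hard=1)
```

Sweeping `n_random` and `n_hard` over a range of values would correspond to the x-axes of panels (a) and (b).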