LLM-based Listwise Reranking under the Effect of Positional Bias

Jingfen Qiao, Jin Huang, Xinyu Ma, Shuaiqiang Wang, Dawei Yin, Evangelos Kanoulas, Andrew Yates

Abstract

LLM-based listwise passage reranking has attracted attention for its effectiveness in ranking candidate passages. However, these models suffer from positional bias, where passages positioned towards the end of the input are less likely to be moved to top positions in the ranking. We hypothesize that there are two primary sources of positional bias: (1) architectural bias inherent in LLMs and (2) the imbalanced positioning of relevant documents. To address this, we propose DebiasFirst, a method that integrates positional calibration and position-aware data augmentation during fine-tuning. Positional calibration uses inverse propensity scoring to adjust for positional bias by re-weighting the contributions of different positions in the loss function when training. Position-aware augmentation augments training data to ensure that each passage appears equally across varied positions in the input list. This approach markedly enhances both effectiveness and robustness to the original ranking across diverse first-stage retrievers, reducing the dependence of NDCG@10 performance on the position of relevant documents. DebiasFirst also complements the inference-stage debiasing methods, offering a practical solution for mitigating positional bias in reranking.
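The two fine-tuning components described above can be sketched in a minimal form. The function names, the per-position loss re-weighting, and the cyclic-rotation augmentation scheme below are illustrative assumptions, not the paper's exact implementation:

```python
def ips_weighted_loss(position_losses, propensities):
    """Positional calibration via inverse propensity scoring (sketch).

    Re-weights the per-position training losses by the inverse of each
    position's estimated propensity, so that positions the reranker
    rarely promotes (e.g. late input positions) contribute more to the
    objective, counteracting positional bias.
    """
    assert len(position_losses) == len(propensities)
    weighted = [loss / p for loss, p in zip(position_losses, propensities)]
    return sum(weighted) / len(weighted)


def position_aware_augment(passages):
    """Position-aware data augmentation (sketch).

    Generates cyclic rotations of the candidate list so that, across the
    augmented training examples, every passage appears exactly once at
    every input position.
    """
    n = len(passages)
    return [passages[i:] + passages[:i] for i in range(n)]
```

For a three-passage list, `position_aware_augment` yields three rotated orderings in which each passage occupies each input position exactly once; the IPS weighting then equalizes how strongly each position influences the loss.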


Paper Structure

This paper contains 11 sections, 6 equations, 9 figures, and 3 tables.

Figures (9)

  • Figure 1: A causal directed acyclic graph illustrating the relationships between input (X), LLM, positional bias (P), and the resulting output ($\pi$) in listwise reranking.
  • Figure 2: Overview of the proposed positional calibration using IPS. Each document’s relevance score $f_\theta(x_i)$ is calibrated by multiplying it with estimated inverse propensity values $\pi_q(x_{i})$ to account for positional bias. The heatmap on the right visualizes the estimated inverse propensities across input and output positions.
  • Figure 3: Number of passages (z-axis) by input position (x-axis) and true reranking position (y-axis).
  • Figure 4: Performance of LLM-based reranking methods when changing the position of the relevant passages within their input on MS MARCO (dev).
  • Figure 5: Evaluating the complementary effects of inference-stage (the PermSC approach of tang2023found) and tuning-stage (our approach) debiasing in LLM-based listwise passage reranking; bars represent the performance on each shuffled ordering; lines represent the aggregated performance using PermSC rank aggregation.
  • ...and 4 more figures