Beyond Reproducibility: Advancing Zero-shot LLM Reranking Efficiency with Setwise Insertion

Jakub Podolak, Leon Peric, Mina Janicijevic, Roxana Petcu

TL;DR

The paper addresses the efficiency and effectiveness of zero-shot document reranking with Large Language Models by reproducing Zhuang et al.'s Setwise method and introducing Setwise Insertion. It demonstrates that leveraging a prior initial ranking within Setwise prompts reduces redundant comparisons and stabilizes decisions, achieving a 31% reduction in query time and a 23% reduction in LLM inferences while slightly improving NDCG@10 relative to the original Setwise method. Experiments across multiple architectures (Flan-T5, Vicuna, Llama) and datasets (TREC 2019/2020) confirm Setwise's superior trade-offs against Pointwise, Pairwise, and Listwise baselines, and show strong robustness in both encoder-decoder and decoder-only models. The work highlights practical gains for efficient, accurate zero-shot reranking and suggests future exploration of broader datasets and model families.

Abstract

This study presents a comprehensive reproducibility and extension analysis of the Setwise prompting methodology for zero-shot ranking with Large Language Models (LLMs), as proposed by Zhuang et al. We evaluate its effectiveness and efficiency compared to traditional Pointwise, Pairwise, and Listwise approaches in document ranking tasks. Our reproduction confirms the findings of Zhuang et al., highlighting the trade-offs between computational efficiency and ranking effectiveness in Setwise methods. Building on these insights, we introduce Setwise Insertion, a novel approach that leverages the initial document ranking as prior knowledge, reducing unnecessary comparisons and uncertainty by focusing on candidates more likely to improve the ranking results. Experimental results across multiple LLM architectures (Flan-T5, Vicuna, and Llama) show that Setwise Insertion yields a 31% reduction in query time, a 23% reduction in model inferences, and a slight improvement in reranking effectiveness compared to the original Setwise method. These findings highlight the practical advantage of incorporating prior ranking knowledge into Setwise prompting for efficient and accurate zero-shot document reranking.

Paper Structure

This paper contains 25 sections, 2 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Original Setwise prompt vs. our proposed prompt with prior knowledge. We bias the model to return document "A" when uncertain. When constructing a prompt from the template, we place the document with the highest prior (e.g., the highest BM25 score) as document A.
  • Figure 2: Our proposed Setwise Insertion Sort method for efficient second-stage reranking of top-k documents using an LLM. At each step of the algorithm, we keep only the top-k documents, in sorted order. For each set of candidates, we check whether any candidate ranks above the lowest-ranked document in the top-k. If so, we promote it into the top-k; otherwise, we discard it.
  • Figure 3: NDCG@10 and the number of inferences per query after introducing our two proposed optimizations: insertion sort and the initial-ranking prior. Results are shown for all tested models and datasets.
  • Figure 4: Average NDCG@10 (3 runs) for setwise.heapsort (no prior) and setwise.insertion (with prior) across all tested models. The original authors' method is shown in green (dashed line), and ours in orange (dotted line). Results are shown separately for TREC 2019 and TREC 2020.
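
The procedure described in the captions of Figures 1 and 2 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `setwise_insertion_rerank`, `bubble_up`, and `pick_best` are hypothetical, the LLM call is replaced by a toy comparator over made-up relevance scores, and the step that places a promoted document inside the top-k uses simple pairwise comparisons as a simplifying assumption. The incumbent document is always listed first, mirroring the role of document "A", and ties go to it to imitate the bias toward the prior.

```python
from typing import Callable, List, Sequence

# Hypothetical comparator: takes a small set of documents and returns the index
# of the one judged most relevant. In the setting above this would be one LLM
# inference, with position 0 playing the role of "document A".
PickBest = Callable[[Sequence[str]], int]


def bubble_up(top: List[str], pos: int, pick_best: PickBest) -> None:
    """Move top[pos] upward while it beats the document directly above it."""
    while pos > 0 and pick_best([top[pos - 1], top[pos]]) == 1:
        top[pos - 1], top[pos] = top[pos], top[pos - 1]
        pos -= 1


def setwise_insertion_rerank(
    docs: Sequence[str], k: int, set_size: int, pick_best: PickBest
) -> List[str]:
    """Rerank `docs` (given in first-stage order, e.g. BM25) and return the top-k."""
    if k < 1 or set_size < 2:
        raise ValueError("k must be >= 1 and set_size must be >= 2")

    # Assumption: the first k documents of the prior ranking seed the top-k,
    # and are put in order by ordinary insertion.
    top: List[str] = []
    for doc in docs[:k]:
        top.append(doc)
        bubble_up(top, len(top) - 1, pick_best)

    rest = list(docs[k:])
    step = set_size - 1  # one slot per set is reserved for the incumbent
    for start in range(0, len(rest), step):
        group = rest[start:start + step]
        while group:
            # The lowest-ranked top-k document goes first (as "document A").
            winner = pick_best([top[-1]] + group)
            if winner == 0:
                break  # no candidate beats it: discard the whole group
            # Promote the winner, dropping the old lowest-ranked document.
            top[-1] = group.pop(winner - 1)
            bubble_up(top, len(top) - 1, pick_best)
    return top


# Toy stand-in for the LLM: made-up relevance scores, ties go to the earliest
# position, which imitates preferring document A when uncertain.
scores = {"d1": 3, "d2": 1, "d3": 2, "d4": 0, "d5": 5, "d6": 0, "d7": 4, "d8": 1}
call_log: List[int] = []  # one entry (the set size) per simulated inference


def pick_best_demo(candidates: Sequence[str]) -> int:
    call_log.append(len(candidates))
    return max(range(len(candidates)), key=lambda i: scores[candidates[i]])


docs = [f"d{i}" for i in range(1, 9)]
result = setwise_insertion_rerank(docs, k=3, set_size=3, pick_best=pick_best_demo)
print(result, len(call_log))
```

In this sketch a single comparison can discard a whole group of candidates when the incumbent wins, which is where the reduction in inferences would come from when the first-stage ranking is already reasonable. The size of `call_log` gives the number of simulated inferences for the toy input only and says nothing about the figures reported in the paper.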