A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models

Shengyao Zhuang; Honglei Zhuang; Bevan Koopman; Guido Zuccon

TL;DR

The paper examines the efficiency-effectiveness trade-offs of zero-shot LLM-based document ranking by comparing Pointwise, Listwise, and Pairwise prompting within a single experimental framework. It introduces Setwise prompting, in which the LLM selects the most relevant document from a set of candidates, so that each inference compares several documents at once; the same prompt can also use model logits to improve Listwise ranking. Results on the TREC DL and BEIR benchmarks show that Setwise often achieves strong effectiveness while substantially reducing the number of LLM inferences and the tokens consumed, particularly when paired with an efficient sorting algorithm (e.g., heap sort). The work also analyzes the trade-off controlled by the number of documents compared per step and shows that the approach is robust to variations in the initial ranking, which matters when deploying LLM-based rerankers in real-world systems. The authors release open-source code for reproducibility and outline future work on additional LLMs and prompt-learning techniques.

Abstract

We propose a novel zero-shot document ranking approach based on Large Language Models (LLMs): the Setwise prompting approach. Our approach complements existing prompting approaches for LLM-based zero-shot ranking: Pointwise, Pairwise, and Listwise. Through a first-of-its-kind comparative evaluation within a consistent experimental framework, considering factors such as model size, token consumption, and latency, we show that existing approaches are inherently characterised by trade-offs between effectiveness and efficiency. We find that while Pointwise approaches score high on efficiency, they suffer from poor effectiveness. Conversely, Pairwise approaches demonstrate superior effectiveness but incur high computational overhead. Our Setwise approach, instead, reduces both the number of LLM inferences and the prompt tokens consumed during the ranking procedure compared to previous methods. This significantly improves the efficiency of LLM-based zero-shot ranking while retaining high zero-shot ranking effectiveness. We make our code and results publicly available at https://github.com/ielab/llm-rankers.
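
To make the idea of Setwise selection inside a sorting algorithm concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical `select` callable that stands in for one LLM inference: given a query and a small set of documents, it returns the position of the most relevant one. The names `setwise_top_k`, `CountingSelector`, and `overlap_selector`, and the choice of giving each heap node `c - 1` children so that every call compares `c` documents, are illustrative assumptions.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for one Setwise LLM inference: returns the index
# (within `docs`) of the document judged most relevant to `query`.
Selector = Callable[[str, Sequence[str]], int]


class CountingSelector:
    """Wraps a selector and counts how many times it is called."""

    def __init__(self, select: Selector) -> None:
        self.select = select
        self.calls = 0

    def __call__(self, query: str, docs: Sequence[str]) -> int:
        self.calls += 1
        return self.select(query, docs)


def setwise_top_k(query: str, docs: Sequence[str], k: int, c: int,
                  select: Selector) -> List[str]:
    """Return the top-k documents using heap sort over a (c-1)-ary heap.

    Each selector call sees a parent and up to c-1 children, i.e. at most
    c documents (an assumption made for this sketch).
    """
    if c < 2:
        raise ValueError("c must be at least 2")
    fanout = c - 1
    heap = list(docs)

    def sift_down(root: int, size: int) -> None:
        while True:
            first = root * fanout + 1
            children = list(range(first, min(first + fanout, size)))
            if not children:
                return
            group = [root] + children
            best = group[select(query, [heap[i] for i in group])]
            if best == root:
                return  # parent already beats all of its children
            heap[root], heap[best] = heap[best], heap[root]
            root = best

    n = len(heap)
    # Build the heap bottom-up, starting from the last node that has a child.
    for root in range((n - 2) // fanout, -1, -1):
        sift_down(root, n)

    ranked: List[str] = []
    size = n
    for _ in range(min(k, n)):
        ranked.append(heap[0])       # current most relevant document
        size -= 1
        heap[0] = heap[size]         # move the last leaf to the root
        sift_down(0, size)
    return ranked


def overlap_selector(query: str, docs: Sequence[str]) -> int:
    """Toy replacement for the LLM: picks the doc sharing most query terms."""
    terms = set(query.lower().split())
    scores = [len(terms & set(d.lower().split())) for d in docs]
    return scores.index(max(scores))


if __name__ == "__main__":
    corpus = [
        "heap sort is a comparison sort",
        "zero-shot ranking with large language models",
        "cooking pasta at home",
        "language models for document ranking",
    ]
    for c in (2, 3, 4):
        counter = CountingSelector(overlap_selector)
        top = setwise_top_k("ranking with language models", corpus, 2, c, counter)
        print(c, counter.calls, top)
```

With `c = 2` the sketch reduces to pairwise comparisons; larger `c` puts more documents in each prompt, and the call counter can be used to observe how the number of selector calls changes with `c` on a given candidate list.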

Paper Structure (21 sections, 4 figures, 4 tables)

Introduction
Background & Related Work
Pointwise prompting approaches
Listwise prompting approaches
Pairwise prompting approaches
Other Directions in using LLMs for Ranking
Setwise Ranking Prompting
Limitations of Current Approaches
Speeding-up Pairwise with Setwise
Listwise Likelihoods with Setwise
Advantages of Setwise
Experiments
Datasets and evaluations
Implementation details
Results and Analysis
...and 6 more sections

Figures (4)

Figure 1: Different prompting strategies. (a) Pointwise, (b) Listwise, (c) Pairwise and (d) our proposed Setwise.
Figure 2: Illustration of the impact of Setwise Prompting vs. Pairwise Prompting on Sorting Algorithms. Nodes are documents, numbers in nodes represent the level of relevance assigned by the LLM (higher is more relevant).
Figure 3: Effectiveness and efficiency trade-offs offered by different approaches. (a, Setwise) The numbers in the scatter plots represent the number of documents $c$ compared at each step of the sorting algorithm. (b, Listwise) The numbers in the scatter plots represent the number of sliding-window repetitions $r$.
Figure 4: Sensitivity to the initial ranking. We use Flan-t5-large and $c=4$ for the Setwise approach.
