LLM4Ranking: An Easy-to-use Framework of Utilizing Large Language Models for Document Reranking

Qi Liu, Haozhe Duan, Yiqun Chen, Quanfeng Lu, Weiwei Sun, Jiaxin Mao

TL;DR

LLM4Ranking presents a modular, easy-to-use toolkit that unifies access to both open-source and API-based LLMs for document reranking. By decoupling LLM interfaces, ranking logic, and model implementations, it supports pointwise, pairwise, and listwise paradigms, along with generation, log-likelihood, and logits-based methods, and provides training and evaluation pipelines. The framework demonstrates zero-shot and supervised reranking experiments across multiple datasets, highlighting the practical effectiveness of API-backed models like RankGPT and TourRank, as well as the value of fine-tuning smaller models. This toolkit facilitates reproducible research and practical deployment for retrieval-augmented generation and other IR tasks, with broad applicability across academic and real-world settings.
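
To make the decoupling described above concrete, here is a minimal sketch of how an LLM interface, a model wrapper, and the ranking logic might be kept separate. It is not the LLM4Ranking API: every name in it (`Backend`, `ToyBackend`, `PointwiseModel`, `PairwiseModel`, `pointwise_rerank`, `pairwise_rerank`) is hypothetical, and the backend is a keyword-overlap stand-in for an LLM so the example runs offline.

```python
from typing import List, Protocol


class Backend(Protocol):
    """Hypothetical LLM interface; a real one would call a local or API model."""

    def relevance(self, query: str, doc: str) -> float: ...


class ToyBackend:
    """Stand-in for an LLM: scores a document by query-term overlap."""

    def relevance(self, query: str, doc: str) -> float:
        q_terms = set(query.lower().split())
        d_terms = set(doc.lower().split())
        return len(q_terms & d_terms) / max(len(q_terms), 1)


class PointwiseModel:
    """Model layer: turns one (query, document) pair into a score."""

    def __init__(self, backend: Backend) -> None:
        self.backend = backend

    def score(self, query: str, doc: str) -> float:
        return self.backend.relevance(query, doc)


class PairwiseModel:
    """Model layer: decides which of two documents is more relevant."""

    def __init__(self, backend: Backend) -> None:
        self.backend = backend

    def prefers_first(self, query: str, a: str, b: str) -> bool:
        return self.backend.relevance(query, a) >= self.backend.relevance(query, b)


def pointwise_rerank(model: PointwiseModel, query: str, docs: List[str]) -> List[str]:
    """Ranking logic: sort documents by their independent scores."""
    return sorted(docs, key=lambda d: model.score(query, d), reverse=True)


def pairwise_rerank(model: PairwiseModel, query: str, docs: List[str]) -> List[str]:
    """Ranking logic: compare all pairs and sort by number of wins."""
    wins = [0] * len(docs)
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if model.prefers_first(query, docs[i], docs[j]):
                wins[i] += 1
            else:
                wins[j] += 1
    order = sorted(range(len(docs)), key=lambda i: wins[i], reverse=True)
    return [docs[i] for i in order]


backend = ToyBackend()
query = "llm document reranking"
docs = [
    "cats sleep a lot",
    "llm reranking of documents",
    "document reranking with llm models",
]
pointwise_result = pointwise_rerank(PointwiseModel(backend), query, docs)
pairwise_result = pairwise_rerank(PairwiseModel(backend), query, docs)
```

Because the ranking functions depend only on the model wrappers, and the wrappers depend only on the `Backend` protocol, swapping the toy backend for a real LLM client would leave the ranking logic untouched in this sketch.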

Abstract

Utilizing large language models (LLMs) for document reranking has been a popular and promising research direction in recent years, and many studies are dedicated to improving the performance and efficiency of using LLMs for reranking. LLM-based reranking can also be applied in many real-world applications, such as search engines or retrieval-augmented generation. In response to the growing demand for research and application in practice, we introduce a unified framework, LLM4Ranking, which enables users to adopt different ranking methods using open-source or closed-source API-based LLMs. Our framework provides a simple and extensible interface for document reranking with LLMs, as well as easy-to-use evaluation and fine-tuning scripts for this task. We conducted experiments based on this framework and evaluated various models and methods on several widely used datasets, providing reproducible results on utilizing LLMs for document reranking. Our code is publicly available at https://github.com/liuqi6777/llm4ranking.

Paper Structure

This paper contains 25 sections, 1 figure, 3 tables.

Figures (1)

  • Figure 1: The overall framework of LLM4Ranking. The left part shows the three core components: the large language model backend, the ranker that holds the abstract ranking algorithm, and the specific model that is used in the ranker. The right part shows the integrated features of the framework, including training and evaluation.