EcoRank: Budget-Constrained Text Re-ranking Using Large Language Models
Muhammad Shihab Rashid, Jannat Ara Meem, Yue Dong, Vagelis Hristidis
TL;DR
This work tackles budget-constrained text re-ranking with large language models by formulating a constrained optimization problem over API choices, prompt designs, and budget splits. It introduces EcoRank, a two-layer cascading pipeline: a first stage with a high-accuracy, expensive LLM filters top passages using pointwise/binary prompts, followed by a second stage with a cheaper LLM performing pairwise ranking on the remaining items. Empirical results on four QA/passage datasets show EcoRank consistently outperforms budget-aware baselines and rivals zero-cost supervised baselines, demonstrating substantial gains in MRR and R@1 under realistic budgets. The approach provides a practical, scalable path for deploying LLM-based re-ranking in cost-constrained environments, with code and methodology open for further automation and extension.
Abstract
Large Language Models (LLMs) have achieved state-of-the-art performance in text re-ranking. This process includes queries and candidate passages in the prompts, utilizing pointwise, listwise, and pairwise prompting strategies. A limitation of these ranking strategies with LLMs is their cost: the process can become expensive due to API charges, which are based on the number of input and output tokens. We study how to maximize the re-ranking performance given a budget, by navigating the vast search spaces of prompt choices, LLM APIs, and budget splits. We propose a suite of budget-constrained methods to perform text re-ranking using a set of LLM APIs. Our most efficient method, called EcoRank, is a two-layered pipeline that jointly optimizes decisions regarding budget allocation across prompt strategies and LLM APIs. Our experimental results on four popular QA and passage reranking datasets show that EcoRank outperforms other budget-aware supervised and unsupervised baselines.
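The two-layer idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a flat cost per LLM call rather than per-token pricing, and every name in it (`cascade_rerank`, `split`, `is_relevant`, `prefers`) is hypothetical. The two callables stand in for the expensive pointwise LLM and the cheaper pairwise LLM. Placing unjudged passages ahead of rejected ones is also an assumption made for illustration.

```python
from typing import Callable, List


def cascade_rerank(
    query: str,
    passages: List[str],
    budget: float,
    split: float,
    pointwise_cost: float,
    pairwise_cost: float,
    is_relevant: Callable[[str, str], bool],
    prefers: Callable[[str, str, str], bool],
) -> List[str]:
    """Two-stage re-ranking under a budget (illustrative sketch only).

    Assumptions: each call has a fixed cost, `split` is the fraction of the
    budget reserved for stage 1, and `passages` arrive in first-stage
    retrieval order.
    """
    # Stage 1: binary relevance judgments on as many top passages as the
    # stage-1 share of the budget can pay for.
    n_judged = min(len(passages), int((budget * split) // pointwise_cost))
    labels = [is_relevant(query, p) for p in passages[:n_judged]]
    kept = [p for p, ok in zip(passages, labels) if ok]
    rejected = [p for p, ok in zip(passages, labels) if not ok]
    unjudged = passages[n_judged:]

    # Stage 2: spend what is left on pairwise comparisons among the kept
    # passages. A single bottom-up bubble pass moves strong passages upward.
    remaining = budget - n_judged * pointwise_cost
    ranked = list(kept)
    for i in range(len(ranked) - 1, 0, -1):
        if remaining < pairwise_cost:
            break  # budget exhausted; keep the current order
        remaining -= pairwise_cost
        if prefers(query, ranked[i], ranked[i - 1]):
            ranked[i], ranked[i - 1] = ranked[i - 1], ranked[i]

    # Assumed final order: kept passages, then unjudged, then rejected.
    return ranked + unjudged + rejected
```

With a budget of zero the function returns the input order unchanged, which makes the effect of each additional unit of budget easy to inspect. A real system would replace the flat costs with token counts priced per API.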
