Contextual Relevance and Adaptive Sampling for LLM-Based Document Reranking
Jerry Huang, Siddarth Madala, Cheng Niu, Julia Hockenmaier, Tong Zhang
TL;DR
The paper addresses the problem that document relevance in LLM-based reranking is context-dependent, especially for reasoning-intensive queries. It defines contextual relevance as the probability that a document is judged relevant to a query, marginalized over the batching contexts it may appear in, and introduces TS-SetRank, a Bayesian, two-phase sampling algorithm that combines uniform exploration with Thompson sampling to estimate this quantity efficiently. The authors provide theoretical guarantees (sublinear regret, posterior consistency, uniform exploration) and show empirically that modeling context improves nDCG@10 over retrieval and reranking baselines by 15-25% on BRIGHT and 6-21% on BEIR under fixed budgets. They compare TS-SetRank to baselines such as BM25 and Heapify, reporting robust gains, especially at smaller budgets, and describe throughput-optimized variants for delayed-feedback scenarios. The work advances practical, cost-efficient reranking by explicitly marginalizing over batching context, with implications for end-to-end retrieval and reasoning-focused information extraction.
Abstract
Reranking algorithms have made progress in improving document retrieval quality by efficiently aggregating relevance judgments generated by large language models (LLMs). However, identifying relevant documents for queries that require in-depth reasoning remains a major challenge. Reasoning-intensive queries often exhibit multifaceted information needs and nuanced interpretations, rendering document relevance inherently context-dependent. To address this, we propose contextual relevance, which we define as the probability that a document is relevant to a given query, marginalized over the distribution of different reranking contexts it may appear in (i.e., the set of candidate documents it is ranked alongside and the order in which the documents are presented to a reranking model). While prior works have studied methods to mitigate the positional bias LLMs exhibit by accounting for the ordering of documents, we empirically find that the composition of these batches also plays an important role in reranking performance. To efficiently estimate contextual relevance, we propose TS-SetRank, a sampling-based, uncertainty-aware reranking algorithm. Empirically, TS-SetRank improves nDCG@10 over retrieval and reranking baselines by 15-25% on BRIGHT and 6-21% on BEIR, highlighting the importance of modeling relevance as context-dependent.
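To make the idea of estimating contextual relevance by sampling concrete, the following is a minimal sketch of a two-phase sampler in the spirit of TS-SetRank. It is not the authors' implementation. Several details are assumptions made for illustration: the per-document Beta(1, 1) prior, the phase ordering (uniform exploration first, then Thompson sampling), the choice to take the top sampled documents as the next batch, and the hypothetical names ts_setrank_sketch, make_toy_judge, explore_rounds, and adaptive_rounds. The judge stands in for an LLM call that returns which documents in a batch it considers relevant.

```python
import random


def ts_setrank_sketch(doc_ids, judge, batch_size, explore_rounds, adaptive_rounds, rng):
    """Illustrative two-phase sampler (assumed design, not the paper's reference code)."""
    # Assumed Beta(1, 1) prior per document:
    # alpha counts "relevant" votes, beta counts "not relevant" votes.
    alpha = {d: 1.0 for d in doc_ids}
    beta = {d: 1.0 for d in doc_ids}

    def update(batch):
        relevant = judge(batch)  # set of doc ids judged relevant in this batch
        for d in batch:
            if d in relevant:
                alpha[d] += 1.0
            else:
                beta[d] += 1.0

    # Phase 1: uniform exploration. Batches are drawn uniformly at random,
    # so each document is seen in varied compositions and orders.
    for _ in range(explore_rounds):
        update(rng.sample(doc_ids, batch_size))

    # Phase 2: Thompson sampling. Draw one sample per posterior and
    # spend the next judgment on the documents with the highest draws.
    for _ in range(adaptive_rounds):
        draws = {d: rng.betavariate(alpha[d], beta[d]) for d in doc_ids}
        batch = sorted(doc_ids, key=lambda d: draws[d], reverse=True)[:batch_size]
        rng.shuffle(batch)  # randomize presentation order within the batch
        update(batch)

    # Posterior mean is the estimate of contextual relevance.
    means = {d: alpha[d] / (alpha[d] + beta[d]) for d in doc_ids}
    ranking = sorted(doc_ids, key=lambda d: means[d], reverse=True)
    return ranking, means


def make_toy_judge(true_prob, rng):
    """Hypothetical stand-in for an LLM reranker: noisy, per-document votes."""
    def judge(batch):
        return {d for d in batch if rng.random() < true_prob[d]}
    return judge


if __name__ == "__main__":
    rng = random.Random(0)
    docs = list(range(10))
    # Toy ground truth: documents 0 and 1 are usually judged relevant.
    true_prob = {d: (0.9 if d < 2 else 0.1) for d in docs}
    ranking, means = ts_setrank_sketch(
        docs, make_toy_judge(true_prob, rng),
        batch_size=4, explore_rounds=20, adaptive_rounds=30, rng=rng,
    )
    print(ranking[:3])
```

The toy judge here ignores batch composition and order, which a real LLM reranker would not. Averaging votes over many randomly composed batches is what approximates marginalizing over reranking contexts, and the total budget is explore_rounds + adaptive_rounds judge calls.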
