Generative Relevance Feedback and Convergence of Adaptive Re-Ranking: University of Glasgow Terrier Team at TREC DL 2023
Andrew Parry, Thomas Jaenich, Sean MacAvaney, Iadh Ounis
TL;DR
This paper combines generative relevance feedback (Gen-QR and Gen-PRF) with graph-based adaptive re-ranking (GAR) for the TREC 2023 Deep Learning Track. It evaluates zero-shot Gen-QR and Gen-PRF over BM25 and SPLADE first-stage retrievers and applies GAR on a BM25 corpus graph G = (V,E) with a budget of 5000 and 32 nearest neighbours, scored by a monoELECTRA cross-encoder S. Gen-PRF with GAR yields the strongest P@10 and nDCG@10, though SPLADE often achieves higher recall and MAP. With large budgets, a lexical first-stage model can approximate the performance of a learned retriever, as shown by rank-biased overlap (RBO) rising to around 0.80. The results suggest that zero-shot generative expansions generalise to new test sets and that graph-based re-ranking is a viable way to reduce reliance on expensive first-stage models in practical, scalable retrieval systems.
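The RBO figure above compares two rankings with more weight on the top ranks. As a rough illustration only, the sketch below computes a truncated rank-biased overlap between two ranked lists of document ids. The function name rbo_truncated, the persistence value p = 0.9, and the choice of the truncated (non-extrapolated) variant are assumptions made for this example; the paper may use a different variant or parameter.

```python
from typing import Optional, Sequence


def rbo_truncated(a: Sequence[str], b: Sequence[str], p: float = 0.9,
                  depth: Optional[int] = None) -> float:
    """Truncated rank-biased overlap of two duplicate-free rankings.

    Computes (1 - p) * sum_{d=1..k} p^(d-1) * |a[:d] & b[:d]| / d.
    The value p = 0.9 is an illustrative assumption, not taken from the paper.
    """
    k = min(len(a), len(b))
    if depth is not None:
        k = min(k, depth)
    seen_a, seen_b = set(), set()
    overlap = 0      # size of the intersection of the two prefixes
    total = 0.0
    for d in range(1, k + 1):
        x, y = a[d - 1], b[d - 1]
        if x == y:
            overlap += 1
        else:
            # x matches an earlier item of b, and/or y an earlier item of a
            overlap += (x in seen_b) + (y in seen_a)
        seen_a.add(x)
        seen_b.add(y)
        total += p ** (d - 1) * overlap / d
    return (1 - p) * total


same = rbo_truncated(["d1", "d2", "d3"], ["d1", "d2", "d3"])
disjoint = rbo_truncated(["d1", "d2"], ["d3", "d4"])
```

Because the sum is truncated at depth k, two identical lists score 1 - p^k rather than exactly 1, so values from this sketch are only comparable at a fixed depth and p.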
Abstract
This paper describes our participation in the TREC 2023 Deep Learning Track. We submitted runs that apply generative relevance feedback from a large language model in both zero-shot and pseudo-relevance feedback settings over two sparse retrieval approaches, namely BM25 and SPLADE. We couple this first stage with adaptive re-ranking over a BM25 corpus graph scored using a monoELECTRA cross-encoder. We investigate the efficacy of these generative approaches for different query types in first-stage retrieval. In re-ranking, we investigate operating points of adaptive re-ranking with different first stages to find the point in graph traversal at which the first stage no longer affects the performance of the overall retrieval pipeline. We find some performance gains from the application of generative query reformulation. However, our strongest run in terms of P@10 and nDCG@10, namely uogtr_b_grf_e_gb, applied both adaptive re-ranking and generative pseudo-relevance feedback.
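To make the adaptive re-ranking step concrete, here is a minimal sketch of graph-based adaptive re-ranking under stated assumptions. It alternates between the first-stage pool and a frontier of corpus-graph neighbours of already scored documents until the scoring budget is spent. The names adaptive_rerank, neighbours and score are hypothetical, the batch size of 16 is an assumption, and the sketch omits details such as backfilling unscored first-stage results; it is not the implementation behind the submitted runs.

```python
import heapq
from collections import deque
from typing import Callable, Dict, List, Sequence, Tuple


def adaptive_rerank(initial: Sequence[str],
                    neighbours: Callable[[str], List[str]],
                    score: Callable[[str], float],
                    budget: int = 5000,
                    batch_size: int = 16,   # assumed value
                    k: int = 32) -> List[Tuple[str, float]]:
    """Score up to `budget` documents, alternating pool and graph frontier."""
    scored: Dict[str, float] = {}
    pool = deque(initial)                   # first-stage results, best first
    frontier: List[Tuple[float, str]] = []  # heap of (-source score, doc id)
    from_frontier = False
    while len(scored) < budget and (pool or frontier):
        limit = min(batch_size, budget - len(scored))
        # Use the frontier on its turn if it has entries, or if the pool is empty.
        use_frontier = (from_frontier and bool(frontier)) or not pool
        batch: List[str] = []
        while len(batch) < limit:
            if use_frontier:
                if not frontier:
                    break
                _, doc = heapq.heappop(frontier)
            else:
                if not pool:
                    break
                doc = pool.popleft()
            if doc not in scored and doc not in batch:
                batch.append(doc)
        for doc in batch:
            scored[doc] = score(doc)        # stands in for the cross-encoder
            for nb in neighbours(doc)[:k]:
                if nb not in scored:
                    # Neighbours of high-scoring documents are explored first.
                    heapq.heappush(frontier, (-scored[doc], nb))
        from_frontier = not from_frontier
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


# Toy corpus graph and scores, invented purely for illustration.
toy_graph = {"d1": ["d3"], "d2": ["d4"], "d3": ["d5"]}
toy_scores = {"d1": 0.2, "d2": 0.1, "d3": 0.9, "d4": 0.3, "d5": 0.95}
toy_run = adaptive_rerank(["d1", "d2"],
                          neighbours=lambda d: toy_graph.get(d, []),
                          score=lambda d: toy_scores[d],
                          budget=4, batch_size=1)
```

In the toy run, d3 and d5 are reached only through the graph, which illustrates how a large budget can let the re-ranker recover documents that the first stage missed.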
