RARe: Retrieval Augmented Retrieval with In-Context Examples
Atula Tejaswi, Yoonsang Lee, Sujay Sanghavi, Eunsol Choi
TL;DR
This work investigates in-context learning for encoder-only text retrievers and introduces RARe, which augments the target query with semantically similar in-context exemplars retrieved via BM25. RARe is trained with a standard contrastive loss and evaluated on the BeIR and reasoning-oriented RAR-b benchmarks, where it improves $nDCG@10$ and generalizes better out of domain. The authors analyze how exemplar quality, quantity, format, and content affect performance, showing that semantically relevant in-context examples yield robust gains and offering guidance for future design choices. The approach is validated across both decoder-based and retriever-based backbones, and code and checkpoints are released to facilitate adoption and further study.
Abstract
While in-context learning is well-studied with decoder-only language models (LLMs), its utility for encoder-only models remains underexplored. We study in-context learning for encoder-only models for text retrieval tasks. Can adding in-context examples (query-document pairs) to the target query enhance retriever performance? Our approach, RARe, finetunes a pre-trained model with in-context examples whose query is semantically similar to the target query. This approach achieves performance gains of up to +2.72% nDCG across open-domain retrieval datasets (BeIR, RAR-b) compared to using only the target query as input. In particular, we find RARe exhibits stronger out-of-domain generalization compared to models using queries without in-context examples, similar to what is seen for in-context learning in LLMs. We further provide analysis on the design choices of in-context example augmentation for retrievers and lay the foundation for future work.
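
The query-augmentation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the BM25 parameters (k1, b), the function names, and the "Example query / Example document / Query" prompt template are all assumptions made for illustration. Only the overall idea, selecting in-context (query, document) pairs whose queries are similar to the target query and adding them to the retriever input, comes from the text.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, chosen for simplicity.
    return text.lower().split()


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score each string in `corpus` against `query` with Okapi BM25
    (non-negative IDF variant). k1 and b are common defaults, not values
    taken from the paper."""
    docs = [tokenize(d) for d in corpus]
    n = len(docs)
    if n == 0:
        return []
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def augment_query(target_query, example_pool, k=2):
    """Prepend the k (query, document) pairs whose queries score highest
    against the target query. The template below is hypothetical."""
    scores = bm25_scores(target_query, [q for q, _ in example_pool])
    ranked = sorted(range(len(example_pool)), key=lambda i: scores[i], reverse=True)
    parts = [
        f"Example query: {example_pool[i][0]} Example document: {example_pool[i][1]}"
        for i in ranked[:k]
    ]
    parts.append(f"Query: {target_query}")
    return "\n".join(parts)
```

The augmented string would then be passed to the retriever's encoder in place of the bare query, both during finetuning and at inference time, while documents are encoded as usual.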
