Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective
Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng
TL;DR
This paper presents what the authors describe as the first comprehensive survey of robustness in neural information retrieval (IR), concentrating on adversarial and out-of-distribution (OOD) robustness. It treats robustness as having three facets (IID stability, OOD generalization, and adversarial resilience) and organizes methods around dense retrieval models (DRMs) and neural ranking models (NRMs). The survey catalogs datasets and evaluation metrics, introduces the BestIR benchmark, and discusses open issues and future directions, including the impact of large language models (LLMs) on IR robustness. Overall, it offers a structured roadmap for developing robust, trustworthy IR systems in the face of adversarial attacks, domain shifts, and evolving data.
Abstract
Recent advances in neural information retrieval (IR) models have significantly enhanced their effectiveness across various IR tasks. The robustness of these models, essential for ensuring their reliability in practice, has also garnered significant attention. With a wide array of research on robust IR being proposed, we believe it is an opportune moment to consolidate the current status, glean insights from existing methodologies, and lay the groundwork for future development. We view the robustness of IR as a multifaceted concept, emphasizing its necessity against adversarial attacks, out-of-distribution (OOD) scenarios, and performance variance. With a focus on adversarial and OOD robustness, we dissect robustness solutions for dense retrieval models (DRMs) and neural ranking models (NRMs), respectively, recognizing them as pivotal components of the neural IR pipeline. We provide an in-depth discussion of existing methods, datasets, and evaluation metrics, shedding light on challenges and future directions in the era of large language models. To the best of our knowledge, this is the first comprehensive survey on the robustness of neural IR models, and we will also be giving our first tutorial presentation at SIGIR 2024 (https://sigir2024-robust-information-retrieval.github.io). Along with the organization of existing work, we introduce a Benchmark for robust IR (BestIR), a heterogeneous evaluation benchmark for robust neural information retrieval, which is publicly available at https://github.com/Davion-Liu/BestIR. We hope that this study provides useful clues for future research on the robustness of IR models and helps to develop trustworthy search engines (see https://github.com/Davion-Liu/Awesome-Robustness-in-Information-Retrieval).
