ChatRetriever: Adapting Large Language Models for Generalized and Robust Conversational Dense Retrieval

Kelong Mao, Chenlong Deng, Haonan Chen, Fengran Mo, Zheng Liu, Tetsuya Sakai, Zhicheng Dou

TL;DR

ChatRetriever targets robust, generalizable conversational dense retrieval by adapting a large language model with a dual-learning method called CSIT (Contrastive Session-Masked Instruction Tuning). CSIT combines contrastive learning of session-level representations with session-masked instruction tuning, which strengthens the encoding of complex sessions while preserving the LLM's generalization ability. Across five benchmarks, ChatRetriever reaches state-of-the-art or near state-of-the-art performance and remains robust when the conversational context varies. The work shows that end-to-end LLM adaptation is viable for retrieval tasks that require nuanced multi-turn comprehension, and it points toward broader instruction-following IR applications.

Abstract

Conversational search requires accurate interpretation of user intent from complex multi-turn contexts. This paper presents ChatRetriever, which inherits the strong generalization capability of large language models to robustly represent complex conversational sessions for dense retrieval. To achieve this, we propose a simple and effective dual-learning approach that adapts an LLM for retrieval via contrastive learning while enhancing complex session understanding through masked instruction tuning on high-quality conversational instruction-tuning data. Extensive experiments on five conversational search benchmarks demonstrate that ChatRetriever substantially outperforms existing conversational dense retrievers, achieving state-of-the-art performance on par with LLM-based rewriting approaches. Furthermore, ChatRetriever exhibits superior robustness in handling diverse conversational contexts. Our work highlights the potential of adapting LLMs for retrieval with complex inputs such as conversational search sessions and proposes an effective approach to advance this research direction.
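The contrastive half of the dual-learning approach can be illustrated with a standard in-batch contrastive (InfoNCE-style) loss. This is a minimal sketch, not the paper's implementation: the excerpt does not state the exact loss, so the dot-product scoring, the `temperature` parameter, the use of other in-batch passages as negatives, and the function names are all assumptions made for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_contrastive_loss(session_embs, passage_embs, temperature=1.0):
    """Mean InfoNCE loss over a batch (hypothetical helper).

    passage_embs[i] is treated as the positive for session_embs[i];
    every other passage in the batch serves as a negative.
    """
    if len(session_embs) != len(passage_embs):
        raise ValueError("each session needs exactly one positive passage")
    losses = []
    for i, session in enumerate(session_embs):
        logits = [dot(session, p) / temperature for p in passage_embs]
        # Log-sum-exp with max subtraction for numerical stability.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(x - top) for x in logits))
        # Negative log-probability of the positive passage.
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


sessions = [[1.0, 0.0], [0.0, 1.0]]
aligned = in_batch_contrastive_loss(sessions, [[1.0, 0.0], [0.0, 1.0]])
swapped = in_batch_contrastive_loss(sessions, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is lower when each session embedding points at its own passage (`aligned`) than when the positives are swapped (`swapped`), which is the behavior a retriever is trained toward.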

Paper Structure (19 sections, 4 equations, 7 figures, 7 tables)

Figures (7)

  • Figure 1: Illustration of adapting LLM for query rewriting and conversational dense retrieval.
  • Figure 2: Overview of CSIT. We fine-tune an LLM into ChatRetriever using dual learning objectives. The last special token (i.e., <EMB_3>) represents the input text, which can be a session or a response. In the session-masked attention matrix, blue squares denote session or response tokens and green squares denote their special tokens.
  • Figure 3: Performance of ChatRetriever at different training steps.
  • Figure 4: The prompt used to generate the response in the partial response modification experiment.
  • Figure 5: The prompt used to judge whether the current query is reasonable in the partial response modification experiment.
  • ...and 2 more figures
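
The session-masked attention matrix mentioned in the Figure 2 caption can be sketched as a boolean mask. The caption does not spell out the masking rule, so the sketch below encodes one plausible reading and should be treated as an assumption: attention is causal, and response positions are blocked from attending to the session text directly, so they can reach the session only through its special tokens. The token layout, the function name, and the choice of three special tokens (suggested by <EMB_3> being the last one, assuming 1-based numbering) are likewise illustrative.

```python
def session_masked_attention(n_session, n_emb, n_response):
    """Build allowed[q][k] for the assumed token layout
    [session tokens | session special tokens | response tokens].

    allowed[q][k] is True when query position q may attend to key position k.
    """
    total = n_session + n_emb + n_response
    response_start = n_session + n_emb
    allowed = []
    for q in range(total):
        row = []
        for k in range(total):
            causal = k <= q  # decoder-style: no attention to future tokens
            # Assumed rule: response tokens cannot see raw session text.
            hides_session = q >= response_start and k < n_session
            row.append(causal and not hides_session)
        allowed.append(row)
    return allowed


mask = session_masked_attention(n_session=3, n_emb=3, n_response=2)
for row in mask:
    print("".join("x" if cell else "." for cell in row))
```

Under this assumed rule, the special tokens still attend to the full session, so the response can only be predicted from what the special tokens have absorbed, which is the kind of pressure that would push session information into the <EMB> representations.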