Large Language Model Can Be a Foundation for Hidden Rationale-Based Retrieval

Luo Ji, Feixiang Guo, Teng Chen, Qingqing Gu, Xiaoyu Wang, Ningyuan Xi, Yihong Wang, Peng Yu, Yue Zhao, Hongyang Lei, Zhonglin Jiang, Yong Chen

TL;DR

The paper tackles hidden rationale retrieval, a challenging retrieval task in which queries and relevant documents are not semantically similar and relevance must be established by reasoning. It introduces LaHoRe, an instruction-tuned LLM-based cross-encoder that converts retrieval into a generative task using a binary-choice prompt and scores relevance via next-token probabilities, with efficiency improvements from prefix decoding and caching. In zero-shot and fine-tuned experiments on Emotional Support Conversation datasets, LaHoRe outperforms baselines, DPO often surpasses SFT, and the end-to-end RAG pipeline compares favorably with raw LLM generation. The work suggests that LLMs can serve as a foundation for broader, reasoning-based retrieval tasks, and it provides an efficient framework with open-source resources for further research and application.

Abstract

Despite recent advances in Retrieval-Augmented Generation (RAG) systems, most retrieval methods are developed for factual retrieval, which assumes that the query and its positive documents are semantically similar. In this paper, we instead propose and study a more challenging type of retrieval task, called hidden rationale retrieval, in which the query and document are not similar but their relevance can be inferred through reasoning chains, logical relationships, or empirical experience. To address such problems, an instruction-tuned large language model (LLM) with a cross-encoder architecture could be a reasonable choice. To further strengthen earlier LLM-based retrievers, we design a special instruction that transforms the retrieval task into a generative task by prompting the LLM to answer a binary-choice question. The model can be fine-tuned with direct preference optimization (DPO). The framework is also optimized for computational efficiency with no performance degradation. We name this retrieval framework LaHoRe and verify its superior zero-shot and fine-tuned performance on Emotional Support Conversation (ESC) compared with previous retrieval works. Our study suggests the potential to employ LLMs as a foundation for a wider scope of retrieval tasks. Our code, models, and datasets are available at https://github.com/flyfree5/LaHoRe.
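
The binary-choice scoring idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt wording, the function names (build_prompt, relevance_score, rank), and the choice to normalize over only the two answer tokens are all assumptions made for illustration. The logits are taken as given inputs, standing in for the next-token logits an LLM would assign to the "Yes" and "No" answer tokens.

```python
import math
from typing import Dict, List, Tuple


def build_prompt(document: str, query: str) -> str:
    """Hypothetical binary-choice prompt (wording is an assumption).

    The document is placed before the query, mirroring the ordering
    described for the framework so that the document part forms a prefix.
    """
    return (
        f"Document: {document}\n"
        f"Query: {query}\n"
        "Is the document a suitable response to the query? Answer Yes or No.\n"
        "Answer:"
    )


def relevance_score(logit_yes: float, logit_no: float) -> float:
    """Probability of "Yes" after a softmax over the two answer logits.

    Restricting the softmax to the two choice tokens is an assumption.
    Subtracting the maximum keeps the exponentials numerically stable.
    """
    m = max(logit_yes, logit_no)
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)


def rank(candidates: Dict[str, Tuple[float, float]]) -> List[str]:
    """Sort candidate documents by descending relevance score."""
    return sorted(
        candidates,
        key=lambda doc: relevance_score(*candidates[doc]),
        reverse=True,
    )
```

In use, each candidate document would be paired with the query through build_prompt, the model's next-token logits for the two answer tokens would be read off, and rank would order the candidates by the resulting scores.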

Paper Structure

This paper contains 10 sections, 3 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Comparison of retrieval paradigms. Conventional fact-based retrieval tasks (Left) can be studied through query-document semantic similarity, while rationale-based retrieval tasks (Right), such as reply strategy or embodied subtask retrieval, cannot.
  • Figure 2: Architectural Comparison of LaHoRe with previous LLM-based retrieval methods. C, G, sim denote contrastive loss, generative loss, similarity function, respectively. (A): Bi-encoder; (B): Cross-encoder; (C): Generative RAG framework; (D): Framework of LaHoRe. We switch the order of Q and D such that D's encoding can be cached.
  • Figure 3: Retrieval performance plots with different positive-negative ratios.
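
The caching remark in the Figure 2 caption relies on a general property of causal language models: the states computed for a prefix do not depend on the tokens that follow it, so placing the document first lets its encoding be computed once and reused for every query. The toy Python sketch below illustrates only that reuse pattern. The names encode_document_prefix and score are hypothetical, the "state" is a placeholder rather than real key/value tensors, and the score arithmetic is meaningless; only the count of expensive calls is of interest.

```python
from functools import lru_cache

ENCODE_CALLS = 0  # counts how often the expensive prefix step actually runs


@lru_cache(maxsize=None)
def encode_document_prefix(document: str) -> tuple:
    """Stand-in for computing an LLM's cached states for the document prefix."""
    global ENCODE_CALLS
    ENCODE_CALLS += 1
    return tuple(ord(ch) for ch in document)  # placeholder "state"


def score(document: str, query: str) -> float:
    """Toy scorer: reuses the cached prefix state, then handles only the query."""
    prefix_state = encode_document_prefix(document)
    query_state = tuple(ord(ch) for ch in query)
    # Placeholder arithmetic; a real system would continue decoding from the cache.
    return ((sum(prefix_state) + sum(query_state)) % 1000) / 1000.0


docs = ["offer reassurance", "ask an open question"]
queries = ["I failed my exam", "I feel lonely", "I lost my job"]
scores = {(d, q): score(d, q) for d in docs for q in queries}
```

With two documents and three queries, six scores are produced while the prefix step runs only once per document. Had the query been placed first, the prefix would differ for every pair and nothing could be shared.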