Enhancing RAG with Active Learning on Conversation Records: Reject Incapables and Answer Capables

Xuzhao Geng, Haozhao Wang, Jun Wang, Wei Liu, Ruixuan Li

TL;DR

This work tackles persistent hallucinations in retrieval-augmented generation (RAG) by applying active learning to unlabeled conversation records in order to build a high-quality human-preference dataset. It introduces AL4RAG, a diversity-aware sampling framework tailored to the three-field structure of RAG data, and Retrieval-Augmented Similarity (RAS), a measure intended to capture cross-sample distance more accurately. A preference dataset is constructed by labeling hallucinations and pairing model responses with explicit refusals, enabling Direct Preference Optimization (DPO) fine-tuning. Experiments on the RAGTruth dataset across multiple tasks show that AL4RAG and its RAS-enhanced variant consistently outperform baselines, especially under limited annotation budgets, demonstrating practical potential for safer and more reliable RAG systems.
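The pairing of model responses with explicit refusals can be illustrated with a minimal sketch. The field names (`query`, `context`, `response`), the refusal wording, and the `prompt`/`chosen`/`rejected` layout are assumptions made for illustration; the summary above does not specify them. The logic follows the stated idea: when an annotator marks a response as hallucinated, the refusal is preferred; otherwise the original response is preferred.

```python
from dataclasses import dataclass

# Hypothetical refusal text; the actual wording used in the paper is not given here.
REFUSAL = "Sorry, I cannot answer this question based on the provided context."


@dataclass
class Record:
    """One annotated conversation record (field names are assumptions)."""
    query: str
    context: str
    response: str
    hallucinated: bool  # label assigned by the annotator


def to_preference_pair(rec: Record, refusal: str = REFUSAL) -> dict:
    """Turn an annotated record into a prompt/chosen/rejected triple."""
    prompt = f"Context:\n{rec.context}\n\nQuestion: {rec.query}"
    if rec.hallucinated:
        # The model could not answer reliably: prefer the refusal.
        chosen, rejected = refusal, rec.response
    else:
        # The model answered correctly: prefer its answer over a refusal.
        chosen, rejected = rec.response, refusal
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}
```

A list of such dictionaries is a common input layout for DPO training code, though the exact schema depends on the training library used.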

Abstract

Retrieval-augmented generation (RAG) is a key technique for leveraging external knowledge and reducing hallucinations in large language models (LLMs). However, RAG still struggles to fully prevent hallucinated responses. To address this, it is essential to identify samples that are prone to hallucination or that can guide LLMs toward correct responses, and to have experts annotate them to develop high-quality datasets for refining LLMs. The growing scarcity of such datasets, however, makes their creation challenging. This paper proposes using the vast amount of conversations produced by widespread LLM usage to build these datasets, training LLMs to decline hallucination-prone questions while accurately answering manageable ones. Because expert annotation of all conversation records is impractical, the paper introduces AL4RAG, which uses active learning to select the most suitable conversation samples for annotation, optimizing performance within an annotation budget. Additionally, because traditional active learning methods are not fully compatible with RAG due to unsuitable distance metrics, the paper develops a novel sample distance measurement for RAG active learning. Extensive experiments show that the method consistently outperforms baselines across multiple metrics.
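To make the idea of selecting samples within an annotation budget concrete, the sketch below uses farthest-first traversal, one common diversity-based selection strategy. It is not necessarily the selection rule used by AL4RAG, and the cosine distance here is only a stand-in: the abstract does not define the paper's RAG-specific distance. The function names and the `distance` parameter are hypothetical; any distance function over sample representations can be plugged in.

```python
import math


def cosine_distance(u, v):
    """Stand-in distance: 1 - cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 1.0
    return 1.0 - dot / (nu * nv)


def select_diverse(embeddings, budget, distance=cosine_distance, start=0):
    """Pick up to `budget` indices by farthest-first traversal.

    Each step adds the sample whose distance to its nearest
    already-selected sample is largest.
    """
    n = len(embeddings)
    if n == 0 or budget <= 0:
        return []
    budget = min(budget, n)
    selected = [start]
    # Distance from every sample to its nearest selected sample so far.
    min_dist = [distance(embeddings[i], embeddings[start]) for i in range(n)]
    while len(selected) < budget:
        candidates = [i for i in range(n) if i not in selected]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        for i in range(n):
            d = distance(embeddings[i], embeddings[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected
```

With a budget of 3 over four samples, two of which are near-duplicates, the near-duplicate is the one left unselected, which is the behavior a limited annotation budget calls for.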

Paper Structure

This paper contains 23 sections, 5 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: (a) An example of the two tasks. In the scenario on the left, the original model gives the user an incorrect response, and the model is expected to decline to answer. In the scenario on the right, the original model gives a correct response, and the aim is for it to generate accurate responses more consistently. (b) Overall framework of the approach. The preference set construction example is the same as in (a).
  • Figure 2: Selected experimental results for query similarity, prompt similarity, and RAS, showing the impact of different similarity measures on model performance. The left graph shows performance on hallucination-prone queries, and the right graph shows performance on model-answerable queries.
  • Figure 3: Performance of the original model, the full-data SFT model, and the full-data DPO-trained model, compared with the model trained via DPO on the 25% of data selected by our method. (a) Rejection rate (rejection performance); (b) ROUGE-L (stability performance); (c) BERTScore (stability performance).