RAC: Efficient LLM Factuality Correction with Retrieval Augmentation

Changmao Li, Jeffrey Flanigan

TL;DR

Retrieval Augmented Correction (RAC) is a simple but effective low-latency post-correction method that enhances the factual performance of LLMs without requiring additional fine-tuning, and it has greatly reduced latency compared to prior approaches.

Abstract

Large Language Models (LLMs) exhibit impressive results across a wide range of natural language processing (NLP) tasks, yet they can often produce factually incorrect outputs. This paper introduces a simple but effective low-latency post-correction method, Retrieval Augmented Correction (RAC), aimed at enhancing the factual performance of LLMs without requiring additional fine-tuning. Our method is general and can be used with any instruction-tuned LLM, and it has greatly reduced latency compared to prior approaches. RAC decomposes the LLM's output into atomic facts and applies a fine-grained verification and correction process that uses retrieved content. Our extensive experiments show that RAC yields up to 30% improvements over state-of-the-art baselines across two popular factuality evaluation datasets, validating its efficacy and robustness both with and without the integration of Retrieval-Augmented Generation (RAG) across different LLMs. Our code is at https://github.com/jlab-nlp/Retrieval-Augmented-Correction

Paper Structure

This paper contains 32 sections, 9 equations, 2 figures, 9 tables.

Figures (2)

  • Figure 1: Approach overview without using RAG. Note that we do not use a verification stage (see Figure 2 below) when not using RAG, since we find that many sentences need to be corrected anyway.
  • Figure 2: Approach overview with RAG. NM means the fact is not mentioned in the retrieved documents. For LLM outputs with RAG, most content is correct; we only need to correct the false facts.
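
The control flow described in the two captions can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `rac_pass`, the label strings, and the toy `toy_verify` / `toy_correct` helpers are all hypothetical, and the LLM-backed verifier and corrector are replaced by dictionary lookups. It assumes the output has already been decomposed into atomic facts, and it assumes that facts labeled NM are left unchanged, which the captions do not state.

```python
from typing import Callable, List

# Hypothetical label strings; "NM" follows the Figure 2 caption
# (fact not mentioned in the retrieved documents).
TRUE, FALSE, NOT_MENTIONED = "TRUE", "FALSE", "NM"


def rac_pass(
    facts: List[str],
    verify: Callable[[str], str],
    correct: Callable[[str], str],
    used_rag: bool,
) -> List[str]:
    """Apply the correction flow from the figure captions to a list of atomic facts."""
    result = []
    for fact in facts:
        if not used_rag:
            # Figure 1: no verification stage, every fact goes to correction.
            result.append(correct(fact))
        elif verify(fact) == FALSE:
            # Figure 2: only facts judged false are corrected.
            result.append(correct(fact))
        else:
            # TRUE and NM facts are kept as-is (an assumption of this sketch).
            result.append(fact)
    return result


# Toy stand-ins for the retrieval-backed verifier and corrector (hypothetical).
CORRECTIONS = {"Paris is the capital of Germany.": "Paris is the capital of France."}
SUPPORTED = {"The Seine flows through Paris."}


def toy_verify(fact: str) -> str:
    if fact in CORRECTIONS:
        return FALSE
    if fact in SUPPORTED:
        return TRUE
    return NOT_MENTIONED


def toy_correct(fact: str) -> str:
    return CORRECTIONS.get(fact, fact)


facts = [
    "Paris is the capital of Germany.",
    "The Seine flows through Paris.",
    "Paris has a famous bakery.",
]
with_rag = rac_pass(facts, toy_verify, toy_correct, used_rag=True)
without_rag = rac_pass(facts, toy_verify, toy_correct, used_rag=False)
```

In this toy setup both modes produce the same corrected list; the difference is only in how many facts are sent to the corrector, which is where skipping unnecessary corrections would save calls when RAG is used.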