Reinforcement Retrieval Leveraging Fine-grained Feedback for Fact Checking News Claims with Black-Box LLM
Xuan Zhang, Wei Gao
TL;DR
This work tackles the difficulty of training retrieval models when a black-box LLM is used for fact-checking, introducing Fine-grained Feedback with Reinforcement Retrieval (FFRR). FFRR uses a two-level reward framework (document-level and question-level) to guide a dense retrieval policy via policy gradient, incorporating intermediate LLM feedback to improve evidence selection. On two public datasets, FFRR substantially outperforms strong LLM-enabled and non-LLM baselines, with the hybrid document-and-question approach yielding the best results. The approach enables effective exploration beyond top-$K$ retrieval while keeping inference overhead low, offering practical gains for real-world news verification tasks.
Abstract
Retrieval-augmented language models have exhibited promising performance across various areas of natural language processing (NLP), including fact-critical tasks. However, because advanced large language models (LLMs) are black boxes and the supervision signal of a specific task is not retrieval-oriented, training the retrieval model in the black-box LLM setting faces significant challenges. We propose an approach leveraging Fine-grained Feedback with Reinforcement Retrieval (FFRR) to enhance fact-checking of news claims with a black-box LLM. FFRR adopts a two-level strategy to gather fine-grained feedback from the LLM by having it rate the retrieved documents against the non-retrieval ground truth of the task; this feedback serves as a reward for optimizing the retrieval policy. We evaluate our model on two public datasets for real-world news claim verification, and the results demonstrate that FFRR achieves significant improvements over strong LLM-enabled and non-LLM baselines.
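
To make the reward-driven retrieval idea concrete, the following is a minimal sketch of a REINFORCE-style policy-gradient update, not the FFRR implementation itself. It assumes that the per-document scores are the trainable parameters (a real dense retriever would update its encoder instead), and it replaces the black-box LLM's rating with a hypothetical `toy_llm_feedback` function that rewards one document and penalizes the rest. Only a document-level reward is shown; the question-level reward is omitted. All function names and hyperparameters are illustrative.

```python
import math
import random


def softmax(scores):
    """Turn raw retrieval scores into a sampling distribution over documents."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(scores, reward_fn, lr=0.5, rng=random):
    """One policy-gradient (REINFORCE) update on per-document scores.

    Assumption: the scores themselves are the trainable parameters; a real
    dense retriever would backpropagate into its encoder instead.
    """
    probs = softmax(scores)
    # Sample instead of taking the arg-max, so lower-ranked documents get explored.
    picked = rng.choices(range(len(scores)), weights=probs, k=1)[0]
    reward = reward_fn(picked)
    # d log pi(picked) / d score_j = 1[j == picked] - probs[j]
    return [
        s + lr * reward * ((1.0 if j == picked else 0.0) - p)
        for j, (s, p) in enumerate(zip(scores, probs))
    ]


def toy_llm_feedback(doc_index):
    """Hypothetical stand-in for black-box LLM feedback: only document 2 is useful."""
    return 1.0 if doc_index == 2 else -1.0


def train(num_docs=3, steps=300, seed=0):
    """Run repeated updates from uniform scores and return the final policy."""
    rng = random.Random(seed)
    scores = [0.0] * num_docs
    for _ in range(steps):
        scores = reinforce_step(scores, toy_llm_feedback, rng=rng)
    return softmax(scores)
```

Calling `train()` returns the learned sampling distribution over the three toy documents. Because the policy samples from the full distribution, documents outside the current top-ranked set can still be tried and rewarded, which is the kind of exploration the summary above attributes to FFRR.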
