Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate
Kyungha Kim, Sangyun Lee, Kung-Hsiang Huang, Hou Pong Chan, Manling Li, Heng Ji
TL;DR
The paper tackles the challenge of obtaining faithful explanations from LLMs in fact-checking, noting that zero-shot explanations often stray from the evidence. It introduces the Multi-Agent Debate Refinement (MADR) framework, which combines a formal error typology with enhanced prompts that include self-refinement and a debate-based error-detection mechanism across multiple agents. Empirical results show that MADR significantly improves the faithfulness of explanations to the underlying evidence and reduces unfaithful elements. This approach enhances the credibility and practical utility of AI-provided explanations in fact-checking settings.
Abstract
Fact-checking research has extensively explored verification but less so the generation of natural-language explanations, crucial for user trust. While Large Language Models (LLMs) excel in text generation, their capability for producing faithful explanations in fact-checking remains underexamined. Our study investigates LLMs' ability to generate such explanations, finding that zero-shot prompts often result in unfaithfulness. To address these challenges, we propose the Multi-Agent Debate Refinement (MADR) framework, leveraging multiple LLMs as agents with diverse roles in an iterative refining process aimed at enhancing faithfulness in generated explanations. MADR ensures that the final explanation undergoes rigorous validation, significantly reducing the likelihood of unfaithful elements and aligning closely with the provided evidence. Experimental results demonstrate that MADR significantly improves the faithfulness of LLM-generated explanations to the evidence, advancing the credibility and trustworthiness of these explanations.
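The iterative multi-agent refinement described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`debate_refine`, `number_critic`, `name_critic`, `drop_sentence_refiner`) are hypothetical, and simple rule-based callables stand in for the LLM agents so that the loop runs without any model. It assumes only the general shape stated in the abstract: several agents with different roles flag unfaithful content, and the explanation is revised until no agent objects or a round limit is reached.

```python
import re
from typing import Callable, List

# Hypothetical agent signatures; in a real system each would wrap an LLM call.
Critic = Callable[[str, str], List[str]]        # (evidence, explanation) -> flagged tokens
Refiner = Callable[[str, str, List[str]], str]  # (evidence, explanation, issues) -> revision


def debate_refine(evidence: str, explanation: str, critics: List[Critic],
                  refiner: Refiner, max_rounds: int = 3) -> str:
    """Revise the explanation until no critic flags anything or rounds run out."""
    for _ in range(max_rounds):
        issues: List[str] = []
        for critic in critics:
            for issue in critic(evidence, explanation):
                if issue not in issues:
                    issues.append(issue)
        if not issues:
            break  # every critic accepts the current explanation
        explanation = refiner(evidence, explanation, issues)
    return explanation


def number_critic(evidence: str, explanation: str) -> List[str]:
    """Toy stand-in: flag numbers in the explanation that the evidence lacks."""
    known = set(re.findall(r"\d+", evidence))
    return [n for n in re.findall(r"\d+", explanation) if n not in known]


def name_critic(evidence: str, explanation: str) -> List[str]:
    """Toy stand-in: flag capitalized words that never occur in the evidence."""
    known = set(re.findall(r"[A-Za-z]+", evidence))
    return [w for w in re.findall(r"\b[A-Z][a-z]+\b", explanation) if w not in known]


def drop_sentence_refiner(evidence: str, explanation: str, issues: List[str]) -> str:
    """Toy stand-in: delete every sentence that contains a flagged token."""
    sentences = [s.strip() for s in explanation.split(".") if s.strip()]
    kept = [s for s in sentences
            if not set(re.findall(r"\w+", s)) & set(issues)]
    return ". ".join(kept) + "." if kept else ""


evidence = "The city council approved 3 new parks in 2021."
draft = "The council approved 3 parks in 2021. Mayor Smith promised 12 more."
result = debate_refine(evidence, draft, [number_critic, name_critic],
                       drop_sentence_refiner)
print(result)
```

In this toy run the second sentence is removed because it introduces a number and names that the evidence does not contain. A real system would replace the rule-based critics and refiner with prompted LLM agents, which is where an error typology and debate prompts would come in.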
