Can LLMs Improve Multimodal Fact-Checking by Asking Relevant Questions?
Alimohammad Beigi, Bohan Jiang, Dawei Li, Zhen Tan, Pouya Shaeri, Tharindu Kumarage, Amrita Bhattacharjee, Huan Liu
TL;DR
This work introduces LRQ-FACT, an LLM-driven framework that automatically generates two types of fact-checking questions (FCQs), visual and textual, to guide evidence retrieval and verification for multimodal misinformation. By integrating image descriptions, VLM-based answers, retrieval-augmented generation for textual questions, and a rule-based decision-maker, LRQ-FACT improves fact-checking performance across three benchmark datasets and adapts to different LLM/VLM backbones. The study includes a quality evaluation of the generated FCQs, ablations of the FCQ components, and case studies, and it reports significant gains over strong baselines. The work highlights the potential of FCQ-driven pipelines to scale automated fact-checking and make it more reliable. It also notes limitations related to expert validation, random baselines, and efficiency, and it outlines paths for future refinement and ethical deployment.
Abstract
Traditional fact-checking relies on humans to formulate relevant and targeted fact-checking questions (FCQs), search for evidence, and verify the factuality of claims. While Large Language Models (LLMs) are commonly used to automate evidence retrieval and factuality verification at scale, their effectiveness for fact-checking is hindered by the absence of FCQ formulation. To bridge this gap, we seek to answer two research questions: (1) Can LLMs generate relevant FCQs? (2) Can LLM-generated FCQs improve multimodal fact-checking? We therefore introduce LRQ-FACT, a framework that uses LLMs to generate relevant FCQs, which facilitate evidence retrieval and enhance fact-checking by probing information across multiple modalities. Through extensive experiments, we examine whether LRQ-FACT can generate relevant FCQs of different types and whether it can consistently outperform baseline methods in multimodal fact-checking. Further analysis illustrates how each component of LRQ-FACT contributes to fact-checking performance.
