RAGAR, Your Falsehood Radar: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models
M. Abdul Khaliq, P. Chang, M. Ma, B. Pflugfelder, F. Miletić
TL;DR
This work tackles political misinformation, including multimodal claims, by introducing RAG-Augmented Reasoning (RAGAR) with two novel techniques: Chain of RAG (CoRAG) and Tree of RAG (ToRAG). The authors build a four-stage multimodal fact-checking pipeline that verbalizes claims with image context, retrieves multimodal evidence, and reasons over evidence using sequential (CoRAG) or branching (ToRAG) strategies, followed by veracity prediction and explanations. Evaluated on a PolitiFact-derived subset of the MOCHEG dataset, ToRAG with CoTVP+CoVe achieves a weighted F1 of 0.85, outperforming baselines, and human annotations confirm high coverage of gold-standard explanations. The study demonstrates that incorporating multimodal evidence and structured RAG-based reasoning improves both veracity accuracy and explanation quality, while also highlighting limitations in dataset scope, retrieval determinism, and ethical deployment considerations.
Abstract
The escalating challenge of misinformation, particularly in political discourse, requires advanced fact-checking solutions, and the need is even more pronounced in the more complex scenario of multimodal claims. We tackle this issue using a multimodal large language model in conjunction with retrieval-augmented generation (RAG), and introduce two novel reasoning techniques: Chain of RAG (CoRAG) and Tree of RAG (ToRAG). They fact-check multimodal claims by extracting both textual and image content, retrieving external information, and reasoning about which follow-up questions to ask based on the evidence gathered so far. We achieve a weighted F1-score of 0.85, surpassing a baseline reasoning technique by 0.14 points. Human evaluation confirms that the vast majority of our generated fact-check explanations contain all information from gold-standard data.
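To make the contrast between the sequential and branching strategies concrete, the following is a minimal control-flow sketch, not the authors' implementation. The callables `ask`, `ask_many`, `answer`, `enough`, and `pick_best` are hypothetical stand-ins for the LLM prompts and the retrieval step; the step cap, the branching factor, and the stopping check are likewise assumptions made for illustration.

```python
from typing import Callable, List, Tuple

QA = Tuple[str, str]  # one (question, answer) evidence pair


def chain_of_rag(
    claim: str,
    ask: Callable[[str, List[QA]], str],       # hypothetical: next question from claim + evidence so far
    answer: Callable[[str], str],              # hypothetical: retrieval-backed answer to a question
    enough: Callable[[str, List[QA]], bool],   # hypothetical: is the evidence sufficient to judge the claim?
    max_steps: int = 6,                        # assumed cap on the chain length
) -> List[QA]:
    """Sequential reasoning: one question per step, each conditioned on prior evidence."""
    history: List[QA] = []
    question = ask(claim, history)
    for _ in range(max_steps):
        history.append((question, answer(question)))
        if enough(claim, history):
            break
        question = ask(claim, history)
    return history


def tree_of_rag(
    claim: str,
    ask_many: Callable[[str, List[QA], int], List[str]],  # hypothetical: several candidate questions
    answer: Callable[[str], str],
    pick_best: Callable[[str, List[QA]], QA],              # hypothetical: keep the most useful pair
    enough: Callable[[str, List[QA]], bool],
    max_steps: int = 6,
    branches: int = 3,                                     # assumed branching factor
) -> List[QA]:
    """Branching reasoning: several questions per step, only the best pair is kept."""
    history: List[QA] = []
    for _ in range(max_steps):
        candidates = [(q, answer(q)) for q in ask_many(claim, history, branches)]
        history.append(pick_best(claim, candidates))
        if enough(claim, history):
            break
    return history
```

In both functions the returned list of question-answer pairs would then be passed to a separate veracity-prediction and explanation step, which is omitted here. Passing the model and retriever in as callables keeps the sketch independent of any particular LLM or search API.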
