MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking
Ting-Chih Chen, Chia-Wei Tang, Chris Thomas
TL;DR
The paper introduces MetaSumPerceiver (MSP), a multimodal, multi-document summarization framework that supports fact-checking by producing concise, evidence-rich summaries from claims, documents, and images. MSP uses a Perceiver-based architecture to handle inputs of arbitrary length and modality. It is trained with reinforcement learning, using an entailment-based reward and KL regularization, so that the generated summaries help a reader assess a claim's truthfulness. The authors release a new Multi-News-Fact-Checking dataset and report experiments on it and on MOCHEG that show state-of-the-art claim verification and explanation generation; ablations confirm the value of cross-modal evidence and critic-based guidance. The work points to a promising way of streamlining real-world fact-checking with targeted, verifiable summaries drawn from heterogeneous sources.
Abstract
Fact-checking real-world claims often requires reviewing multiple multimodal documents to assess a claim's truthfulness, which is a highly laborious and time-consuming task. In this paper, we present a summarization model designed to generate claim-specific summaries useful for fact-checking from multimodal, multi-document datasets. The model takes documents, images, and a claim as input. We introduce a dynamic perceiver-based model that can handle inputs from multiple modalities of arbitrary lengths. To train our model, we leverage a novel reinforcement learning-based entailment objective to generate summaries that provide evidence distinguishing between different truthfulness labels. To assess the efficacy of our approach, we conduct experiments on both an existing benchmark and a new dataset of multi-document claims that we contribute. Our approach outperforms the state-of-the-art (SOTA) approach by 4.6% in the claim verification task on the MOCHEG dataset and demonstrates strong performance on our new Multi-News-Fact-Checking dataset.
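
To make the training signal concrete, the following is a minimal sketch of how an entailment-based reward with a KL penalty could be combined into a single scalar. It is not the authors' implementation: the function names (`kl_divergence`, `shaped_reward`), the coefficient `beta`, the label names, and the choice of subtracting a scaled KL term from the entailment probability of the gold label are all assumptions made for illustration. The entailment probabilities are assumed to come from some external entailment model that scores the generated summary against the claim.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def shaped_reward(entail_probs, gold_label, policy_probs, reference_probs, beta=0.1):
    """Hypothetical reward: entailment score for the gold label minus a KL penalty.

    entail_probs    -- dict mapping truthfulness label -> probability assigned by
                       an (assumed) entailment model given the claim and summary
    gold_label      -- the correct truthfulness label for the claim
    policy_probs    -- next-token distribution of the summarizer being trained
    reference_probs -- next-token distribution of a frozen reference summarizer
    beta            -- assumed weight of the KL regularizer
    """
    entailment_reward = entail_probs[gold_label]
    penalty = beta * kl_divergence(policy_probs, reference_probs)
    return entailment_reward - penalty


# Illustrative values only; none of these numbers come from the paper.
entail_probs = {"supported": 0.7, "refuted": 0.2, "nei": 0.1}
no_drift = shaped_reward(entail_probs, "supported", [0.5, 0.5], [0.5, 0.5])
with_drift = shaped_reward(entail_probs, "supported", [0.9, 0.1], [0.5, 0.5])
```

In this sketch, a summary earns a higher reward when the entailment model assigns more probability to the correct label, while the KL term discourages the summarizer from drifting far from the reference model's output distribution.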
