Reading and Reasoning over Chart Images for Evidence-based Automated Fact-Checking
Mubashara Akhtar, Oana Cocarascu, Elena Simperl
TL;DR
The paper defines chart-based fact-checking and presents ChartBERT, a reading-generation-embedding pipeline that fuses OCR-derived chart text with structural cues to verify claims against chart evidence. It introduces ChartFC, a 15,886-sample dataset derived from TabFact to benchmark chart-based evidence verification and systematically evaluates 75 vision-language baselines, with ChartBERT achieving 63.8% accuracy. The work demonstrates feasibility but also highlights substantial challenges in numerical reasoning and chart variability, underscoring the need for further multimodal and chart-specific reasoning research. Overall, it provides a new task, a first chart-focused AFC model, and a large benchmark to spur progress in evidence-based verification using chart imagery.
Abstract
Evidence data for automated fact-checking (AFC) can be in multiple modalities such as text, tables, images, audio, or video. While there is increasing interest in using images for AFC, previous works mostly focus on detecting manipulated or fake images. We propose a novel task, chart-based fact-checking, and introduce ChartBERT as the first model for AFC against chart evidence. ChartBERT leverages textual, structural and visual information of charts to determine the veracity of textual claims. For evaluation, we create ChartFC, a new dataset of 15,886 charts. We systematically evaluate 75 different vision-language (VL) baselines and show that ChartBERT outperforms VL models, achieving 63.8% accuracy. Our results suggest that the task is complex yet feasible, with many challenges ahead.
