ChartLens: Fine-grained Visual Attribution in Charts
Manan Suri, Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Dinesh Manocha
TL;DR
This work tackles hallucinations in chart-focused multimodal language models by introducing the task of Post-Hoc Visual Attribution for Charts and ChartLens, a grounding method built on segmentation and set-of-marks (SoM) prompting. It formalizes the attribution objective as a mapping $f:(c,v)\mapsto \mathcal{A}_{c,v}$ with three evaluation criteria (relevance, completeness, and precision), and presents ChartVA-Eval, a benchmark of 1200+ samples drawn from synthetic and real-world sources for fine-grained attribution assessment. ChartLens combines heuristic and SAM-based segmentation to produce visual marks, then grounds model responses in those marks via SoM prompting and chain-of-thought validation, improving attribution accuracy over baselines by 26-66%. The approach is validated on bar, line, and pie charts using real datasets such as MATSA, PlotQA, and ChartQA, and is relevant to chart interpretation in domains like finance and policy. The work lays a foundation for transparent, verifiable chart reasoning in critical applications and points to future integration with textual elements and broader forms of visual data.
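To make the precision and completeness criteria concrete, the sketch below scores a predicted attribution set against a reference set. It is illustrative only: it assumes attribution sets can be represented as sets of integer element IDs, treats completeness as recall, and adds an F1 summary. The function name `attribution_scores` is hypothetical, and the exact metric definitions used by ChartVA-Eval are not given in this summary.

```python
from typing import Dict, Set


def attribution_scores(predicted: Set[int], gold: Set[int]) -> Dict[str, float]:
    """Score predicted chart-element IDs against reference IDs.

    Assumption: each chart element (bar, line point, pie slice) has an
    integer ID, and an attribution is the set of IDs that support a response.
    """
    overlap = len(predicted & gold)
    # Precision: share of predicted elements that are truly supporting.
    precision = overlap / len(predicted) if predicted else 0.0
    # Completeness (treated here as recall): share of supporting elements found.
    completeness = overlap / len(gold) if gold else 0.0
    denom = precision + completeness
    f1 = 2 * precision * completeness / denom if denom else 0.0
    return {"precision": precision, "completeness": completeness, "f1": f1}


# Hypothetical example: the model cites elements 2, 5, and 7; only 2 and 5 are needed.
scores = attribution_scores({2, 5, 7}, {2, 5})
```

In this example the prediction is complete but not fully precise, since it includes one element that the reference attribution does not.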
Abstract
The growing capabilities of multimodal large language models (MLLMs) have advanced tasks like chart understanding. However, these models often suffer from hallucinations, where generated text sequences conflict with the provided visual data. To address this, we introduce Post-Hoc Visual Attribution for Charts, which identifies fine-grained chart elements that validate a given chart-associated response. We propose ChartLens, a novel chart attribution algorithm that uses segmentation-based techniques to identify chart objects and employs set-of-marks prompting with MLLMs for fine-grained visual attribution. Additionally, we present ChartVA-Eval, a benchmark with synthetic and real-world charts from diverse domains like finance, policy, and economics, featuring fine-grained attribution annotations. Our evaluations show that ChartLens improves fine-grained attributions by 26-66%.
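The set-of-marks step described above can be sketched as follows: number each segmented chart object, ask the model which numbered marks support the response, and keep only the IDs that correspond to real marks. This is a minimal sketch under stated assumptions, not the ChartLens implementation. The bounding-box representation, the prompt wording, the `MARKS:` reply format, and the helper names `assign_marks`, `build_prompt`, and `parse_marks` are all hypothetical, and the segmentation and MLLM calls are left out.

```python
import re
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # assumed (x0, y0, x1, y1) pixel coordinates


def assign_marks(boxes: List[Box]) -> Dict[int, Box]:
    """Number segments left to right (ties broken top to bottom), starting at 1."""
    ordered = sorted(boxes, key=lambda b: (b[0], b[1]))
    return {i: box for i, box in enumerate(ordered, start=1)}


def build_prompt(response: str, marks: Dict[int, Box]) -> str:
    """Build a hypothetical text prompt to accompany the mark-annotated chart image."""
    ids = ", ".join(str(i) for i in marks)
    return (
        f"The chart is annotated with numbered marks: {ids}.\n"
        f"Response to verify: {response}\n"
        "Reason step by step, then finish with a line 'MARKS: <ids>' "
        "listing the marks that support the response."
    )


def parse_marks(reply: str, marks: Dict[int, Box]) -> List[int]:
    """Extract mark IDs from the 'MARKS:' line, dropping IDs that do not exist."""
    match = re.search(r"MARKS:\s*([\d,\s]+)", reply)
    if not match:
        return []
    found = {int(tok) for tok in re.findall(r"\d+", match.group(1))}
    return sorted(found & set(marks))


# Hypothetical segments for three bars, given in arbitrary order.
marks = assign_marks([(50, 10, 60, 90), (10, 30, 20, 90), (30, 20, 40, 90)])
prompt = build_prompt("The third bar is the tallest.", marks)
# Simulated model reply; mark 9 does not exist and is filtered out.
cited = parse_marks("The tallest bar is mark 3.\nMARKS: 2, 3, 9", marks)
```

Filtering the reply against the known mark IDs is one simple guard against the model citing elements that were never drawn on the chart.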
