Is this chart lying to me? Automating the detection of misleading visualizations

Jonathan Tonglet; Jan Zimny; Tinne Tuytelaars; Iryna Gurevych

TL;DR

This work tackles the problem of misleading visualizations by introducing Misviz and Misviz-synth, two large open benchmarks for detecting misleaders. It develops a rule-based linter leveraging axis metadata, and image-axis classifiers that combine visual and axis information, evaluating them against state-of-the-art MLLMs. The results reveal a clear generalization gap: MLLMs excel on real-world data, while axis-aware methods dominate synthetic data, and axis extraction models trained on synthetic data struggle to generalize to real-world charts. The datasets and baselines enable targeted improvements for safeguarding readers and supporting chart designers, while highlighting future directions such as broader misleader taxonomies and improved axis extraction generalization.
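To make the idea of a rule-based linter over axis metadata concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the tick values of a numeric axis have already been extracted and are listed in display order (bottom to top), and that ticks are evenly spaced on screen. The function name `lint_axis`, its parameters, and the three rules shown are illustrative assumptions.

```python
from typing import List, Sequence


def lint_axis(ticks: Sequence[float], chart_type: str = "bar",
              rel_tol: float = 1e-6) -> List[str]:
    """Return warnings for a numeric axis (hypothetical helper).

    ticks: tick values in display order, bottom to top (assumed input).
    chart_type: coarse chart type label, e.g. "bar" or "line" (assumed).
    """
    warnings: List[str] = []
    if len(ticks) < 2:
        return warnings  # not enough metadata to apply any rule

    diffs = [b - a for a, b in zip(ticks, ticks[1:])]

    # Rule 1: values shrink while moving up the axis.
    if all(d < 0 for d in diffs):
        warnings.append("inverted axis")

    # Rule 2: the value gap between consecutive ticks is not constant,
    # even though the ticks are assumed to be evenly spaced on screen.
    steps = [abs(d) for d in diffs]
    if max(steps) - min(steps) > rel_tol * max(steps):
        warnings.append("inconsistent tick intervals")

    # Rule 3: a bar chart whose value axis starts above zero.
    if chart_type == "bar" and min(ticks) > 0:
        warnings.append("truncated axis")

    return warnings


if __name__ == "__main__":
    print(lint_axis([50, 60, 70, 80], chart_type="bar"))
    print(lint_axis([0, 10, 50, 100], chart_type="line"))
```

A linter of this kind is only as reliable as the axis metadata it receives, which is consistent with the reported difficulty of transferring axis extraction from synthetic to real-world charts.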

Abstract

Misleading visualizations are a potent driver of misinformation on social media and the web. By violating chart design principles, they distort data and lead readers to draw inaccurate conclusions. Prior work has shown that both humans and multimodal large language models (MLLMs) are frequently deceived by such visualizations. Automatically detecting misleading visualizations and identifying the specific design rules they violate could help protect readers and reduce the spread of misinformation. However, the training and evaluation of AI models has been limited by the absence of large, diverse, and openly available datasets. In this work, we introduce Misviz, a benchmark of 2,604 real-world visualizations annotated with 12 types of misleaders. To support model training, we also create Misviz-synth, a synthetic dataset of 57,665 visualizations generated using Matplotlib and based on real-world data tables. We perform a comprehensive evaluation on both datasets using state-of-the-art MLLMs, rule-based systems, and image-axis classifiers. Our results reveal that the task remains highly challenging. We release Misviz, Misviz-synth, and the accompanying code.
