ChartQA-X: Generating Explanations for Visual Chart Reasoning

Shamanthak Hegde, Pooyan Fazli, Hasti Seifi

TL;DR

ChartQA-X provides the largest dataset for generating explanations alongside chart-based QA, enabling unified training of vision-language systems to produce both answers and grounded explanations. The authors implement a multi-stage pipeline: generate explanations with six VLMs, evaluate and select explanations via ROSCOE metrics, and verify correctness with cross-model validation. Human studies show ChartQA-X explanations rival or surpass human explanations in key dimensions, and fine-tuning on ChartQA-X yields large gains in explanation quality, QA accuracy, and generalization to unseen chart datasets. The work advances transparent chart reasoning and demonstrates practical benefits for data-driven decision support and trust in AI-enabled chart interpretation.

Abstract

The ability to explain complex information from chart images is vital for effective data-driven decision-making. In this work, we address the challenge of generating detailed explanations alongside answering questions about charts. We present ChartQA-X, a comprehensive dataset comprising 30,799 chart samples across four chart types, each paired with contextually relevant questions, answers, and explanations. Explanations are generated and selected based on metrics such as faithfulness, informativeness, coherence, and perplexity. Our human evaluation with 245 participants shows that model-generated explanations in ChartQA-X surpass human-written explanations in accuracy and logic and are comparable in terms of clarity and overall quality. Moreover, models fine-tuned on ChartQA-X show substantial improvements across various metrics, including absolute gains of up to 24.57 points in explanation quality, 18.96 percentage points in question-answering accuracy, and 14.75 percentage points on unseen benchmarks for the same task. By integrating explanatory narratives with answers, our approach enables agents to convey complex visual information more effectively, improving comprehension and fostering greater trust in the generated responses.
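The selection step described above (several candidate explanations per question, scored on faithfulness, informativeness, coherence, and perplexity, then checked for correctness) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Candidate` fields, the equal-weight aggregate in `quality_score`, the use of inverse perplexity as a fluency term, and the choice to fall back to the next-ranked candidate when the top one fails verification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """One candidate explanation from one VLM (all fields are hypothetical)."""
    model: str
    text: str
    faithfulness: float     # higher is better, assumed to lie in [0, 1]
    informativeness: float  # higher is better, assumed to lie in [0, 1]
    coherence: float        # higher is better, assumed to lie in [0, 1]
    perplexity: float       # lower is better, assumed >= 1
    verified: bool          # outcome of a separate correctness check


def quality_score(c: Candidate) -> float:
    """Assumed aggregate: equal-weight mean of three metrics and inverse perplexity."""
    fluency = 1.0 / c.perplexity  # maps lower perplexity to a higher score
    return (c.faithfulness + c.informativeness + c.coherence + fluency) / 4.0


def select_explanation(candidates: List[Candidate]) -> Optional[Candidate]:
    """Rank candidates by score and return the best one that passed verification."""
    ranked = sorted(candidates, key=quality_score, reverse=True)
    for candidate in ranked:
        if candidate.verified:
            return candidate
    return None  # no candidate survived verification


if __name__ == "__main__":
    pool = [
        Candidate("vlm_a", "Sum the two tallest bars.", 0.90, 0.80, 0.85, 4.0, True),
        Candidate("vlm_b", "Read the legend first.", 0.95, 0.90, 0.90, 2.0, False),
        Candidate("vlm_c", "The answer is in the title.", 0.60, 0.70, 0.65, 5.0, True),
    ]
    best = select_explanation(pool)
    print(best.model if best else "no verified explanation")
```

In this toy pool the highest-scoring candidate fails verification, so the next-ranked verified one is kept. A real pipeline would compute the metric values with a scoring library and run an actual verification step rather than reading a precomputed flag.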

Paper Structure

This paper contains 20 sections, 1 equation, 6 figures, 13 tables.

Figures (6)

  • Figure 1: ChartQA-X dataset enables training VLMs capable of generating both answers and explanations in response to user questions about charts.
  • Figure 2: ChartQA-X is constructed in four stages: (1) preparing input data, (2) generating explanations using six VLMs, (3) selecting high-quality explanations based on ROSCOE scores, and (4) verifying explanation correctness. The dataset is evaluated through three types of experiments: (a) human-subject studies, (b) benchmark evaluations using accuracy and text generation metrics, and (c) generalizability tests on unseen datasets.
  • Figure 3: Distribution of (a) chart types, and (b) question types in the dataset.
  • Figure 4: Distribution of explanations in ChartQA-X including (a) percentage of explanations obtained from each VLM, (b) lengths of explanations, and (c) Box plot comparing Human and ChartQA-X performance across four evaluation metrics: Accuracy, Clarity, Logic, and Overall Quality. Each pair of boxes represents the distribution of scores for a specific metric, highlighting differences in performance and variance.
  • Figure 5: Distribution of length of explanations across different models.
  • ...and 1 more figure