GoT-CQA: Graph-of-Thought Guided Compositional Reasoning for Chart Question Answering

Lingling Zhang, Muye Huang, QianYing Wang, Yaxian Wang, Wenjun Wu, Jun Liu

TL;DR

This paper tackles chart question answering (CQA), which requires grounding in chart data and complex multi-step reasoning. It introduces GoT-CQA, a framework that converts a chart question into a directed acyclic Graph-of-Thought (GoT) composed of Localization, Numerical, and Logical operator nodes to guide an auto-generated compositional reasoning process over the chart data, culminating in a transformer-based answer decoder. The approach leverages a Donut-based chart encoder and converts questions into GoTs via templates or language model prompts, enabling structured, stepwise reasoning that generalizes across question types. Empirical results on ChartQA and PlotQA-D show GoT-CQA achieving strong performance, especially on challenging human-written and reasoning-heavy questions, with ablations confirming the value of GoT guidance and multi-operator reasoning for interpretability and accuracy.

Abstract

Chart Question Answering (CQA) aims to answer questions based on visual chart content and plays an important role in chart summarization, business data analysis, and data report generation. CQA is a challenging multi-modal task because of its strong context dependence and complex reasoning requirements. The former means that a question must be answered strictly from an analysis of the visual content or underlying data of the given chart, while the latter refers to the various logical and numerical reasoning steps involved in predicting the answer. In this paper, we focus on complex reasoning in CQA and propose a novel Graph-of-Thought (GoT) guided compositional reasoning model, GoT-CQA, to address it. First, we transform the chart-oriented question into a directed acyclic GoT composed of multiple operator nodes, including localization, numerical, and logical operators. The GoT intuitively reflects how a human would solve the question. Then, we design an efficient auto-compositional reasoning framework, guided by the GoT, to execute the multi-step reasoning operations required by various types of questions. Comprehensive experiments on the ChartQA and PlotQA-D datasets show that GoT-CQA achieves outstanding performance compared with the latest popular baselines, especially on complex human-written and reasoning questions.
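To make the idea of a directed acyclic GoT concrete, the following is a minimal symbolic sketch, not the authors' implementation. In the paper the operator nodes guide neural reasoning blocks over encoded chart features; here each node is replaced by a plain Python function over a small, made-up data table. The names `Node`, `run_got`, `CHART`, and the example question are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Callable

# Hypothetical chart data (series label -> value); not taken from the paper.
CHART = {"2019": 40.0, "2020": 55.0, "2021": 70.0}


@dataclass
class Node:
    kind: str                      # "Loc", "Num", or "Log"
    op: Callable[..., object]      # symbolic stand-in for a reasoning block
    inputs: tuple[str, ...] = ()   # names of predecessor nodes


def run_got(nodes: dict[str, Node], chart: dict[str, float], answer: str):
    """Execute operator nodes in topological order and return one node's result."""
    # TopologicalSorter takes {node: predecessors}; static_order() yields
    # predecessors before the nodes that depend on them and raises
    # graphlib.CycleError if the graph is not acyclic.
    order = TopologicalSorter(
        {name: node.inputs for name, node in nodes.items()}
    ).static_order()
    results: dict[str, object] = {}
    for name in order:
        node = nodes[name]
        args = [results[i] for i in node.inputs]
        results[name] = node.op(chart, *args)
    return results[answer]


# Assumed question: "Is the difference between 2021 and 2019 greater than 20?"
got = {
    "loc_2021": Node("Loc", lambda chart: chart["2021"]),
    "loc_2019": Node("Loc", lambda chart: chart["2019"]),
    "diff": Node("Num", lambda chart, a, b: a - b, ("loc_2021", "loc_2019")),
    "gt_20": Node("Log", lambda chart, d: d > 20, ("diff",)),
}

print(run_got(got, CHART, "gt_20"))  # True, since 70.0 - 40.0 = 30.0 > 20
```

The two localization nodes have no predecessors, the numerical node consumes both, and the logical node consumes the numerical result, so the same executor handles questions of different depth without question-specific control flow.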

Paper Structure (12 sections, 2 equations, 4 figures, 5 tables)

Figures (4)

  • Figure 1: Examples of CQA and VQA tasks.
  • Figure 2: The overview of the proposed GoT-CQA framework.
  • Figure 3: Some GoT examples of questions.
  • Figure 4: Architecture for four types of reasoning blocks. Left: self-data reasoning; Right: Loc/Num/Log reasoning.