ChartGalaxy: A Dataset for Infographic Chart Understanding and Generation

Zhen Li, Duan Li, Yukai Guo, Xinyuan Guo, Bowen Li, Lanxi Xiao, Shenyu Qiao, Jiashu Chen, Zijian Wu, Hui Zhang, Xinhuan Shu, Shixia Liu

TL;DR

ChartGalaxy tackles the challenge of understanding and generating infographic charts with large vision-language models (LVLMs) by constructing a million-scale dataset of real and synthetic infographics, each paired with tabular data. It combines real-world templates and patterns with a human-in-the-loop synthesis pipeline to produce 1.7 million synthetic and 61k real charts across 75 types, 440 variations, and 68 layouts. The authors demonstrate the dataset’s value through three applications: infographic chart understanding via VQA, executable chart code generation benchmarks, and example-based infographic generation, reporting notable performance gains and qualitative improvements. This resource advances multimodal reasoning and content generation for infographics and provides a scalable foundation for LVLM training and evaluation, with potential impact on media, education, and data storytelling.

Abstract

Infographic charts are a powerful medium for communicating abstract data by combining visual elements (e.g., charts, images) with textual information. However, their visual and structural richness poses challenges for large vision-language models (LVLMs), which are typically trained on plain charts. To bridge this gap, we introduce ChartGalaxy, a million-scale dataset designed to advance the understanding and generation of infographic charts. The dataset is constructed through an inductive process that identifies 75 chart types, 440 chart variations, and 68 layout templates from real infographic charts and uses them to create synthetic ones programmatically. We showcase the utility of this dataset through: 1) improving infographic chart understanding via fine-tuning, 2) benchmarking code generation for infographic charts, and 3) enabling example-based infographic chart generation. By capturing the visual and structural complexity of real design, ChartGalaxy provides a useful resource for enhancing multimodal reasoning and generation in LVLMs.

Paper Structure

This paper contains 42 sections, 4 equations, 31 figures, 13 tables, 1 algorithm.

Figures (31)

  • Figure 1: ChartGalaxy, a million-scale dataset of synthetic and real infographic charts with data tables, supporting applications in infographic chart understanding, code generation, and chart generation.
  • Figure 2: Overview of our dataset construction method.
  • Figure 3: Examples of synthetic infographic charts in ChartGalaxy. The bottom-left illustration on each infographic chart shows the corresponding layout template.
  • Figure 4: Three examples of infographic charts used in the user study. In each example, A is the reference chart, B and C are generated by GPT-Image-1 and our method, respectively, using the same data.
  • Figure 5: 75 chart types and 440 chart variations (Part 1).
  • ...and 26 more figures