ChartGalaxy: A Dataset for Infographic Chart Understanding and Generation
Zhen Li, Duan Li, Yukai Guo, Xinyuan Guo, Bowen Li, Lanxi Xiao, Shenyu Qiao, Jiashu Chen, Zijian Wu, Hui Zhang, Xinhuan Shu, Shixia Liu
TL;DR
ChartGalaxy addresses the difficulty that large vision-language models (LVLMs) have in understanding and generating infographic charts by constructing a million-scale dataset of real and synthetic infographics, each paired with tabular data. The dataset comprises 1.7 million synthetic and 61k real charts spanning 75 chart types, 440 chart variations, and 68 layout templates; the types, variations, and templates are derived from real-world infographics, and the synthetic charts are produced from them by a human-in-the-loop synthesis pipeline. The authors demonstrate the dataset’s value through three applications: infographic chart understanding via visual question answering (VQA), a benchmark for generating executable chart code, and example-based infographic generation, reporting performance gains and qualitative improvements. The resource supports multimodal reasoning and content generation for infographics and provides a scalable foundation for LVLM training and evaluation, with potential impact on media, education, and data storytelling.
Abstract
Infographic charts are a powerful medium for communicating abstract data by combining visual elements (e.g., charts, images) with textual information. However, their visual and structural richness poses challenges for large vision-language models (LVLMs), which are typically trained on plain charts. To bridge this gap, we introduce ChartGalaxy, a million-scale dataset designed to advance the understanding and generation of infographic charts. The dataset is constructed through an inductive process that identifies 75 chart types, 440 chart variations, and 68 layout templates from real infographic charts and uses them to create synthetic ones programmatically. We showcase the utility of this dataset through: 1) improving infographic chart understanding via fine-tuning, 2) benchmarking code generation for infographic charts, and 3) enabling example-based infographic chart generation. By capturing the visual and structural complexity of real designs, ChartGalaxy provides a useful resource for enhancing multimodal reasoning and generation in LVLMs.
