ChartGPT: Leveraging LLMs to Generate Charts from Abstract Natural Language

Yuan Tian, Weiwei Cui, Dazhen Deng, Xinjing Yi, Yurun Yang, Haidong Zhang, Yingcai Wu

TL;DR

ChartGPT tackles the problem of generating accurate charts from abstract natural language by decomposing the task into six sub-tasks and fine-tuning an LLM (FLAN-T5-XL) on an nvBench-derived dataset of abstract utterances. The approach uses a step-by-step reasoning pipeline, a Vega-Lite-based template, and an interactive interface that lets users inspect and adjust intermediate outputs. Quantitative evaluation and a user study show that ChartGPT outperforms NL4DV and ncNet in consistency and similarity, and users report flexible, semantically aware chart generation with meaningful opportunities for exploration. The work provides a practical framework for controllable NL2VIS and contributes a dataset for future research, while outlining directions for future work: expanding the supported transformations, handling follow-up utterances, and scaling to larger data tables.

Abstract

The use of natural language interfaces (NLIs) to create charts is becoming increasingly popular due to the intuitiveness of natural language interactions. One key challenge in this approach is to accurately capture user intents and transform them into proper chart specifications. This challenge obstructs the wide use of NLIs in chart generation, as users' natural language inputs are generally abstract (i.e., ambiguous or under-specified), without a clear specification of visual encodings. Recently, pre-trained large language models (LLMs) have exhibited superior performance in understanding and generating natural language, demonstrating great potential for downstream tasks. Inspired by this major trend, we propose ChartGPT, which generates charts from abstract natural language inputs. However, LLMs struggle with complex logic problems. To enable the model to accurately specify the complex parameters and perform the operations involved in chart generation, we decompose the generation process into a step-by-step reasoning pipeline, so that the model only needs to reason about a single, specific sub-task during each run. Moreover, LLMs are pre-trained on general datasets, which might be biased for the task of chart generation. To provide adequate visualization knowledge, we create a dataset consisting of abstract utterances and charts and improve model performance through fine-tuning. We further design an interactive interface for ChartGPT that allows users to check and modify the intermediate outputs of each step. The effectiveness of the proposed system is evaluated through quantitative evaluations and a user study.

Paper Structure (36 sections, 9 figures)

Figures (9)

  • Figure 1: An example of chart generation problem formulation. (a) The task comprises three stages: input context (data table and natural language), formatted visualization specification, and charts. (b) We decompose the first stage transformation process into two successive transformations: data transformation (b1) and visualization transformation (b2), involving six steps. At each step, the model utilizes the input context and previous answers to generate the next output.
  • Figure 2: ChartGPT overview. ChartGPT takes a data table and an utterance provided by the user as input (a). To generate the chart, ChartGPT employs a step-by-step transformation process (b) that decomposes the chart generation task into six sequential steps (b1). Each step is solved by the LLM fine-tuned on our constructed dataset (b2). By leveraging the output from each step, ChartGPT generates visualization specifications and presents charts to the user (c).
  • Figure 3: The template sequence for each sub-task.
  • Figure 4: The statistics of our constructed dataset. Specifically, "abstract" denotes our generated abstract utterances, "original" denotes the maintained original utterances from nvBench, and "total" denotes our total dataset, which includes the "abstract" and "original" utterances.
  • Figure 5: The Turing test results comparing our generated utterances with those from the NLV Corpus. (a) The rate at which each subject judged incorrectly. (b) The average rate at which utterances from each of the two sets were judged as human-created.
  • ...and 4 more figures
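
The step-by-step process described in the Figure 1 and Figure 2 captions, where each step sees the input context plus all previous answers, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the six step names, the prompt layout, and the `model` callable are all assumptions made for the example, since the text above only states that there are six sequential steps covering data transformation and visualization transformation.

```python
from typing import Callable, Dict, List

# Hypothetical step names (assumption): the summary only says there are
# six sequential steps, so these labels are placeholders.
STEP_NAMES: List[str] = [
    "select_columns",
    "add_filter",
    "add_aggregation",
    "select_chart_type",
    "select_encodings",
    "add_sort",
]


def build_prompt(table: str, utterance: str,
                 answers: Dict[str, str], step: str) -> str:
    """Combine the input context with all previous answers for one step."""
    lines = [f"table: {table}", f"utterance: {utterance}"]
    # Earlier answers are appended in order so later steps can build on them.
    lines += [f"{name}: {answer}" for name, answer in answers.items()]
    lines.append(f"{step}:")  # the model is asked to complete this line
    return "\n".join(lines)


def run_pipeline(table: str, utterance: str,
                 model: Callable[[str], str]) -> Dict[str, str]:
    """Run the steps in order; each call solves a single sub-task."""
    answers: Dict[str, str] = {}
    for step in STEP_NAMES:
        prompt = build_prompt(table, utterance, answers, step)
        answers[step] = model(prompt)
    return answers


if __name__ == "__main__":
    # Stand-in for a fine-tuned LLM (assumption): returns a fixed string.
    def dummy_model(prompt: str) -> str:
        return "placeholder answer"

    result = run_pipeline("cars(name, mpg, year)", "show fuel trends", dummy_model)
    for step, answer in result.items():
        print(step, "->", answer)
```

Keeping the intermediate answers in an ordered dictionary also mirrors the interactive interface idea: a user could edit one entry and re-run only the later steps, though how ChartGPT actually does this is not specified in the text above.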