Data Formulator 2: Iterative Creation of Data Visualizations, with AI Transforming Data Along the Way

Chenglong Wang, Bongshin Lee, Steven Drucker, Dan Marshall, Jianfeng Gao

TL;DR

Data Formulator 2 (DF2) tackles iterative visualization authoring by combining a shelf-style chart builder with natural language inputs and delegating data transformations to AI. It introduces data threads to capture non-linear authoring history and support branching, enabling reuse and backtracking across iterations. The system translates mixed inputs into Vega-Lite specifications and AI-generated data transformations, supporting diverse chart types while providing verification through derived data views and explanatory notes. A user study with eight participants demonstrates flexible iteration styles and effective collaboration with AI during challenging exploratory tasks, highlighting DF2's potential to reduce skill barriers and improve reproducibility in data visualization workflows.

Abstract

Data analysts often need to iterate between data transformations and chart designs to create rich visualizations for exploratory data analysis. Although many AI-powered systems have been introduced to reduce the effort of visualization authoring, existing systems are not well suited for iterative authoring. They typically require analysts to provide, in a single turn, a text-only prompt that fully describes a complex visualization. We introduce Data Formulator 2 (DF2 for short), an AI-powered visualization system designed to overcome this limitation. DF2 blends graphical user interfaces and natural language inputs to enable users to convey their intent more effectively, while delegating data transformation to AI. Furthermore, to support efficient iteration, DF2 lets users navigate their iteration history and reuse previous designs, eliminating the need to start from scratch each time. A user study with eight participants demonstrated that DF2 allowed participants to develop their own iteration styles to complete challenging data exploration sessions.

Paper Structure (12 sections, 11 figures)

This paper contains 12 sections, 11 figures.

Figures (11)

  • Figure 1: An analyst explores electricity from different energy sources, renewable percentage trends, and country rankings by renewable percentages using a dataset on CO$_2$ and electricity for 20 countries (2000-2020, Table 1). The analyst creates five data versions in three branches to support different chart designs. DF2 allows users to manage iteration directions and create rich visualizations using a blended UI and natural language inputs.
  • Figure 2: DF2 overview. Users create visualizations by providing fields (drag-and-drop or type) and NL instructions to the Chart Builder, delegating data transformation to AI. Data View shows derived data. Users navigate data history and select contexts for the next iteration using Data Threads (the thread in use is displayed as local data threads). They refine or create new charts by providing instructions in Chart Builder. The main panel provides pop-up windows to inspect code, explanations, and chat history.
  • Figure 3: Experiences with DF2: (1) creating the basic renewable energy chart using drag-and-drop to encode fields; (2 and 3) creating charts requiring new fields by providing field names and optional natural language instructions to derive new data.
  • Figure 4: Iteration with DF2: (1) provide an instruction to filter the renewable energy percentage chart by top CO$_2$ countries, (2) update the chart with Global Median? and instruct DF2 to add the global median alongside the top 5 CO$_2$ countries' trends, and (3) move Global Median? from column to opacity to update the chart design without deriving new data.
  • Figure 5: DF2's workflow: (1) DF2 generates a Vega-Lite spec skeleton based on user specifications and chart type. (2) If new fields (e.g., Rank) are required, DF2 prompts its AI model to generate data transformation code. (3) The Vega-Lite skeleton is then instantiated with the new data to produce the desired chart.
  • ...and 6 more figures
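
The three-step workflow described in the Figure 5 caption can be illustrated with a minimal Python sketch. This is not DF2's implementation: the function names (`build_skeleton`, `missing_fields`, `create_chart`, `stand_in_transform`), the row format, and the field name `Renewable Percentage` are all hypothetical, and the AI-generated transformation is replaced by a hand-written stand-in that derives a `Rank` field.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Transform = Callable[[List[Row], List[str]], List[Row]]


def build_skeleton(chart_type: str, encodings: Dict[str, str]) -> dict:
    """Step 1: build a Vega-Lite spec skeleton from the chart type and a
    channel -> field mapping (no data attached yet)."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "mark": chart_type,
        "encoding": {ch: {"field": f} for ch, f in encodings.items()},
    }


def missing_fields(encodings: Dict[str, str], rows: List[Row]) -> List[str]:
    """Return encoded fields that do not exist in the current data."""
    available = set(rows[0].keys()) if rows else set()
    return sorted(set(encodings.values()) - available)


def create_chart(chart_type: str, encodings: Dict[str, str],
                 rows: List[Row], derive: Transform) -> dict:
    spec = build_skeleton(chart_type, encodings)
    needed = missing_fields(encodings, rows)
    if needed:
        # Step 2: in DF2 this is where an AI model would be prompted to
        # generate transformation code; here `derive` is a plain function.
        rows = derive(rows, needed)
    # Step 3: instantiate the skeleton with the (possibly derived) data.
    spec["data"] = {"values": rows}
    return spec


def stand_in_transform(rows: List[Row], needed: List[str]) -> List[Row]:
    """Hypothetical stand-in for AI-generated code: rank entities by
    renewable percentage, highest first."""
    if needed != ["Rank"]:
        raise ValueError(f"stand-in cannot derive {needed}")
    ordered = sorted(rows, key=lambda r: r["Renewable Percentage"], reverse=True)
    ranks = {r["Entity"]: i for i, r in enumerate(ordered, start=1)}
    return [{**r, "Rank": ranks[r["Entity"]]} for r in rows]
```

Calling `create_chart("bar", {"x": "Entity", "y": "Rank"}, rows, stand_in_transform)` on rows that lack a `Rank` column triggers the derivation step before the data is attached. When every encoded field already exists, the transformation is skipped, which mirrors the Figure 4 case of updating a chart design without deriving new data.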