Challenges & Opportunities with LLM-Assisted Visualization Retargeting

Luke S. Snyder, Chenglong Wang, Steven M. Drucker

TL;DR

This paper studies the challenges of retargeting visualization code to new data using LLMs, comparing a direct code-generation baseline to a constrained program-synthesis pipeline. It evaluates performance across multiple datasets and chart types, finding that data transformation and encoding updates are critical for successful retargeting, while the pipeline often introduces more syntactic errors and struggles with complex data. The authors provide actionable recommendations, including mixed-initiative interfaces, explicit data-dependency surfacing, and transformation-aware controls to mitigate errors. The work highlights that while LLM-assisted retargeting holds promise, practical systems must combine interactive guidance with robust data preparation to achieve reliable results in real-world scenarios.

Abstract

Despite the ubiquity of visualization examples published on the web, retargeting existing custom chart implementations to new datasets remains difficult, time-intensive, and tedious. The adaptation process assumes author familiarity with both the implementation of the example and how the new dataset might need to be transformed to fit into the example code. With recent advances in Large Language Models (LLMs), automatic adaptation of code can be achieved from high-level user prompts, reducing the barrier for visualization retargeting. To better understand how LLMs can assist retargeting and its potential limitations, we characterize and evaluate the performance of LLM assistance across multiple datasets and charts of varying complexity, categorizing failures according to type and severity. In our evaluation, we compare two approaches: (1) directly instructing the LLM to fully generate and adapt code by treating code as text inputs and (2) a more constrained program synthesis pipeline where the LLM guides the code construction process by providing structural information (e.g., visual encodings) based on properties of the example code and data. We find that both approaches struggle when new data has not been appropriately transformed, and discuss important design recommendations for future retargeting systems.
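
To make the data-transformation problem concrete, here is a minimal sketch that is not taken from the paper. It assumes a hypothetical example chart whose code expects long-form records with `group` and `value` fields, and a new dataset that arrives in wide form (one column per measurement). The function names (`wide_to_long`, `box_plot_inputs`), the field names, and the sample values are all illustrative assumptions.

```python
from collections import defaultdict


def wide_to_long(rows, id_field, value_fields):
    """Reshape wide records into long-form records with 'group' and 'value'."""
    long_rows = []
    for row in rows:
        for field in value_fields:
            long_rows.append(
                {id_field: row[id_field], "group": field, "value": row[field]}
            )
    return long_rows


def box_plot_inputs(long_rows):
    """Collect values per group, the shape the assumed chart code consumes."""
    grouped = defaultdict(list)
    for row in long_rows:
        grouped[row["group"]].append(row["value"])
    return dict(grouped)


# Hypothetical new dataset in wide form (made-up values for illustration).
new_data = [
    {"patient": "p1", "area": 10.0, "perimeter": 4.0},
    {"patient": "p2", "area": 12.5, "perimeter": 4.6},
]

# Without this reshaping step, the assumed chart code would find no
# 'group' or 'value' fields in the new data.
long_rows = wide_to_long(new_data, "patient", ["area", "perimeter"])
inputs = box_plot_inputs(long_rows)
print(inputs)
```

In this sketch, the reshaping step is separate from the chart code, so a retargeting system could surface it to the user as an explicit, inspectable stage rather than leaving it implicit in generated plotting code.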

Paper Structure

This paper contains 10 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Baseline retargeting errors across 32 different chart instances. Higher errors are observed for more complex charts (annotations, regression) and datasets with high (IMDb) or low (US Unemployment) complexity.
  • Figure 2: Example retargeted chart specifications for both LLM baseline and pipeline. (A-C) Retargeted Matplotlib box plot to brain tumor dataset. (D-F) Retargeted Seaborn scatter plot with regression to IMDb Movies dataset; the LLM pipeline fails to render the chart (F).