How Good is ChatGPT in Giving Advice on Your Visualization Design?

Nam Wook Kim, Yongsu Ahn, Grace Myers, Benjamin Bach

TL;DR

The paper investigates how well ChatGPT can answer visualization design questions and serve as a design assistant for practitioners lacking formal training. Using a mixed-methods approach, it compares ChatGPT responses to anonymous Human replies on VisGuides across six metrics and conducts a qualitative study of practitioners’ experiences with AI and human feedback. Findings show ChatGPT 4 delivers broader, clearer, and more actionable guidance than humans in many cases, but humans excel in depth, contextual understanding, and fluid conversations; participants value human feedback for bespoke, context-sensitive recommendations and trustworthiness. The work identifies opportunities to design LLM-based feedback systems that leverage AI for ideation while preserving human judgment, and it offers concrete design considerations for integrating such tools into visualization practice and education.

Abstract

Data visualization creators often lack formal training, resulting in a knowledge gap in design practice. Large language models such as ChatGPT, with their vast internet-scale training data, offer transformative potential to address this gap. In this study, we used both qualitative and quantitative methods to investigate how well ChatGPT can address visualization design questions. First, we quantitatively compared the ChatGPT-generated responses with anonymous online Human replies to data visualization questions on the VisGuides user forum. Next, we conducted a qualitative user study examining the reactions and attitudes of practitioners toward ChatGPT as a visualization design assistant. Participants were asked to bring their visualizations and design questions and received feedback from both Human experts and ChatGPT in randomized order. Our findings from both studies underscore ChatGPT's strengths, particularly its ability to rapidly generate diverse design options, while also highlighting areas for improvement, such as nuanced contextual understanding and fluid interaction dynamics beyond the chat interface. Drawing on these insights, we discuss design considerations for future LLM-based design feedback systems.

Paper Structure

This paper contains 67 sections, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Methodology Overview: The methodology comprises two key phases. In the first phase, questions answered within a forum space by Human respondents are explored and then presented to ChatGPT. The second phase involves a feedback session in which users solicit visualization design feedback from both ChatGPT and Human experts.
  • Figure 2: Mean scores (dots) and confidence intervals (error bars) across six metrics—Actionability, Breadth, Clarity, Coverage, Depth, and Topicality—for responses from ChatGPT 3.5, ChatGPT 4, and Human. Overall, ChatGPT 4 shows slightly higher mean scores than ChatGPT 3.5 and Human. Human responses exhibit higher variance but often perform comparably to ChatGPT.
  • Figure 3: Three selected and edited examples of questions and responses from the VisGuides forum. The ChatGPT responses consistently demonstrate breadth, topicality, and coverage. ChatGPT 4, in particular, showcases greater expertise by offering more in-depth and actionable advice in an authoritative manner, rather than merely listing ideas. The Human response ratings are more varied, as evident when comparing the first and last examples. Full questions and responses are provided in the supplement. Respondent IDs have been anonymized.
  • Figure 4: User preference for Human expert vs. ChatGPT responses in feedback sessions: Overall, the charts indicate that users preferred Human experts over ChatGPT in the feedback sessions.
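
As an illustration of the kind of summary Figure 2 reports (a mean score with a confidence interval per metric and respondent type), the following is a minimal sketch. It assumes a normal-approximation 95% interval; the paper's actual interval method is not stated here, and the function name `mean_ci` and the sample ratings are hypothetical.

```python
import math
import statistics


def mean_ci(scores, z=1.96):
    """Return (mean, lower, upper) for a list of numeric ratings.

    Assumption: a normal-approximation interval (z = 1.96 for roughly 95%),
    which may differ from the method used in the paper.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate an interval")
    mean = statistics.mean(scores)
    # Standard error of the mean from the sample standard deviation.
    std_err = statistics.stdev(scores) / math.sqrt(n)
    return mean, mean - z * std_err, mean + z * std_err


# Hypothetical ratings for one metric, not data from the paper.
example_ratings = {
    "ChatGPT 4": [4, 5, 4, 4, 5, 3],
    "Human": [2, 5, 3, 5, 1, 4],
}

for respondent, scores in example_ratings.items():
    mean, lower, upper = mean_ci(scores)
    print(f"{respondent}: mean={mean:.2f}, interval=[{lower:.2f}, {upper:.2f}]")
```

A wider interval for one respondent type reflects more spread in its ratings, which is how the higher variance of Human responses would show up in such a plot.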