WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization

Liwenhan Xie, Chengbo Zheng, Haijun Xia, Huamin Qu, Chen Zhu-Tian

TL;DR

This work addresses the reliability and usability challenges of LLM-powered data analysis by moving beyond raw code to an on-the-fly, interactive visualization of data operations. WaitGPT converts streaming code produced by LLMs into a growing diagram of data-operation nodes and runtime table glyphs, enabling real-time inspection, retrospective review, and granular refinement without regenerating code. A formative study (N=8) motivates design decisions, and a user study (N=12) demonstrates that WaitGPT improves error detection, user agency, and confidence in results compared to a code-only baseline. The approach offers practical benefits for monitoring, steering, and validating LLM-driven analytics and suggests design directions for more transparent human-AI collaboration in data workflows.

Abstract

Large language models (LLMs) support data analysis through conversational user interfaces, as exemplified in OpenAI's ChatGPT (formerly known as Advanced Data Analysis or Code Interpreter). Essentially, LLMs produce code for accomplishing diverse analysis tasks. However, presenting raw code can obscure the logic and hinder user verification. To empower users with enhanced comprehension and augmented control over analysis conducted by LLMs, we propose a novel approach to transform LLM-generated code into an interactive visual representation. In the approach, users are provided with a clear, step-by-step visualization of the LLM-generated code in real time, allowing them to understand, verify, and modify individual data operations in the analysis. Our design decisions are informed by a formative study (N=8) probing into user practice and challenges. We further developed a prototype named WaitGPT and conducted a user study (N=12) to evaluate its usability and effectiveness. The findings from the user study reveal that WaitGPT facilitates monitoring and steering of data analysis performed by LLMs, enabling participants to enhance error detection and increase their overall confidence in the results.
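
To make the idea of "transforming LLM-generated code into data operations" concrete, the following is a minimal sketch, not the authors' implementation. It assumes the generated code is Python in a pandas style and that static analysis with the standard `ast` module is enough to recover operations. The function names `extract_operations` and `method_chain` and the record fields are hypothetical, chosen only for illustration.

```python
import ast


def method_chain(call):
    """Return the method names of a call chain, innermost first.

    For `recent.groupby("region").sum()` this yields ["groupby", "sum"].
    """
    names = []
    node = call
    while isinstance(node, ast.Call):
        func = node.func
        if isinstance(func, ast.Attribute):
            names.append(func.attr)
            node = func.value  # step toward the object the method is called on
        else:
            names.append(getattr(func, "id", "<call>"))
            break
    return list(reversed(names))


def extract_operations(code):
    """List top-level assignments of the form `name = <call>` as operation records.

    Each record names the output variable, the chained operations, and the
    previously assigned variables it reads (its input tables, by assumption).
    """
    known = set()  # variables produced by earlier operations
    ops = []
    for stmt in ast.parse(code).body:
        if not (isinstance(stmt, ast.Assign) and isinstance(stmt.value, ast.Call)):
            continue
        target = stmt.targets[0]
        if not isinstance(target, ast.Name):
            continue  # skip tuple or attribute targets in this sketch
        used = {n.id for n in ast.walk(stmt.value) if isinstance(n, ast.Name)}
        ops.append({
            "output": target.id,
            "ops": method_chain(stmt.value),
            "inputs": sorted(used & known),
            "line": stmt.lineno,
        })
        known.add(target.id)
    return ops


generated = (
    "import pandas as pd\n"
    'df = pd.read_csv("sales.csv")\n'
    'recent = df.query("year >= 2020")\n'
    'totals = recent.groupby("region").sum()\n'
)
for record in extract_operations(generated):
    print(record)
```

Each record corresponds to one candidate node in a flow diagram, and the `inputs` field gives the edges between nodes. The code is only parsed, never executed, so runtime details such as table sizes would have to come from a separate execution step.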

Paper Structure

This paper contains 58 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: We propose a workflow that identifies data operations within the generated code and maps them to visual, interactive primitives on the fly. These primitives collectively offer an overview of the data analysis process.
  • Figure 2: A screenshot of the WaitGPT user interface. (A) An enlarged view of the flow diagram representing the code. (B) An illustration of the "table glyphs" that flow along the edge showing table dependency and changes during code generation. (C) Inspecting intermediate data by toggling the interactive table panel. (D) Interrogating LLM based on an operation.
  • Figure 3: An illustration of how the diagram grows with animated table glyphs during the code generation process.
  • Figure 4: The visualization offers multiple interactions for inspecting and refining the underlying data analysis. Users can: (A) toggle a table node to view the underlying data; (B) hover over a node to highlight its corresponding code; (C) modify a data operation using natural language; (D) directly manipulate the parameters of a node; and (E) view the resulting visualizations from the analysis.
  • Figure 5: User ratings on the baseline (code-only interface) and WaitGPT.
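
The incremental growth described for Figure 3 can be approximated by re-parsing the code as it streams in. The sketch below is an assumption about one way to do this, not a description of WaitGPT's internals. It buffers incoming chunks, parses only the text up to the last newline, and reports each newly completed top-level statement once. The class name `StreamingStatementParser` and its `feed` method are hypothetical.

```python
import ast


class StreamingStatementParser:
    """Collect streamed code chunks and emit each completed top-level statement once."""

    def __init__(self):
        self.buffer = ""   # all text received so far
        self.emitted = 0   # number of top-level statements already reported

    def feed(self, chunk):
        """Add a chunk and return the source of any newly completed statements."""
        self.buffer += chunk
        # Only consider text up to the last newline; the trailing partial
        # line may still be mid-token.
        complete = self.buffer[: self.buffer.rfind("\n") + 1]
        try:
            body = ast.parse(complete).body
        except SyntaxError:
            # A statement spans several lines and is not closed yet; wait.
            return []
        new = body[self.emitted:]
        self.emitted = len(body)
        return [ast.get_source_segment(complete, stmt) for stmt in new]


parser = StreamingStatementParser()
chunks = ['df = pd.re', 'ad_csv("a.csv")\nres', 'ult = df.head(\n', '    5)\n']
for chunk in chunks:
    for statement in parser.feed(chunk):
        print("new node:", repr(statement))
```

One limitation of this sketch: a compound statement such as an `if` block is reported as soon as its first body line parses, and later body lines are not reported again. A fuller version would need to hold back or refresh the last statement.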