Improving Steering and Verification in AI-Assisted Data Analysis with Interactive Task Decomposition
Majeed Kazemitabaar, Jack Williams, Ian Drosos, Tovi Grossman, Austin Henley, Carina Negreanu, Advait Sarkar
TL;DR
This paper tackles the challenge of steering and verifying AI-assisted data analysis with large language models by introducing two interactive task-decomposition interfaces, Phasewise and Stepwise. Through a formative study and a controlled within-subjects experiment, the authors show that exposing editable AI assumptions, structured progression, and side conversations yields greater perceived control and easier verification compared with a standard conversational baseline. The work provides design guidelines for AI-assisted data analysis tools, highlighting trade-offs between information overload and control, and demonstrates the value of progressive disclosure and co-audit capabilities for reliable data analysis workflows. These insights have practical impact for building more trustworthy, auditable, and user-driven AI data-analysis assistants in real-world settings.
Abstract
LLM-powered tools like ChatGPT Data Analysis have the potential to help users tackle the challenging task of data analysis programming, which requires expertise in data processing, programming, and statistics. However, our formative study (n=15) uncovered serious challenges in verifying AI-generated results and steering the AI (i.e., guiding the AI system to produce the desired output). We developed two contrasting approaches to address these challenges. The first (Stepwise) decomposes the problem into step-by-step subgoals with pairs of editable assumptions and code until task completion, while the second (Phasewise) decomposes the entire problem into three editable, logical phases: structured input/output assumptions, execution plan, and code. A controlled, within-subjects experiment (n=18) compared these systems against a conversational baseline. Users reported significantly greater control with the Stepwise and Phasewise systems and found intervention, correction, and verification easier than with the baseline. The results suggest design guidelines and trade-offs for AI-assisted data analysis tools.
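To make the contrast between the two decompositions concrete, the following is a minimal sketch of how their editable state might be modeled. It is an illustration only, not the authors' implementation: the class names, field names, and the rule that editing an assumption discards later steps are all assumptions introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One Stepwise subgoal: editable assumptions paired with code (hypothetical model)."""
    subgoal: str
    assumptions: list[str]
    code: str


@dataclass
class StepwiseSession:
    """Stepwise: the task grows one reviewed subgoal at a time."""
    task: str
    steps: list[Step] = field(default_factory=list)

    def add_step(self, step: Step) -> None:
        self.steps.append(step)

    def edit_assumption(self, step_index: int, assumption_index: int, text: str) -> None:
        # Assumption of this sketch: an edit invalidates every later step,
        # so those steps are dropped and would need to be regenerated.
        self.steps[step_index].assumptions[assumption_index] = text
        del self.steps[step_index + 1:]


@dataclass
class PhasewiseSession:
    """Phasewise: the whole task is split into three editable phases."""
    task: str
    assumptions: list[str] = field(default_factory=list)  # input/output assumptions
    plan: list[str] = field(default_factory=list)         # execution plan
    code: str = ""                                        # generated code

    def phases(self) -> list[tuple[str, object]]:
        # Phases in the order named in the abstract.
        return [
            ("assumptions", self.assumptions),
            ("plan", self.plan),
            ("code", self.code),
        ]
```

In this sketch the Stepwise state is an open-ended list that is extended until the task is complete, while the Phasewise state always has exactly three slots covering the entire problem.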
