Xavier: Toward Better Coding Assistance in Authoring Tabular Data Wrangling Scripts
Yunfan Zhou, Xiwen Cai, Qiming Shi, Yanwei Huang, Haotian Li, Huamin Qu, Di Weng, Yingcai Wu
TL;DR
This work tackles the misalignment between data contexts and AI-driven code completions in data wrangling tasks. It introduces Xavier, a computational notebook extension that couples code context with data contexts at three levels (tables, columns, and rows) to deliver data-context-aware code suggestions, automatic data highlighting, and real-time transformation previews. A preliminary study informs the design requirements, and a user study with 16 analysts demonstrates that Xavier substantially reduces context switches and errors during scripting, with positive user feedback on its transparency and verification aids. The findings suggest that integrating data contexts into coding assistance can significantly improve the efficiency and accuracy of data wrangling workflows and users' trust in them, with potential for broader adoption across data tools and languages.
Abstract
Data analysts frequently employ code completion tools when writing custom scripts to tackle complex tabular data wrangling tasks. However, existing tools do not sufficiently link data contexts, such as schemas and values, with the code being edited. This leads not only to poor code suggestions but also to frequent interruptions in the coding process, as users need to write additional code to locate and understand the relevant data. We introduce Xavier, a tool designed to enhance the authoring of data wrangling scripts in computational notebooks. Xavier maintains users' awareness of data contexts while providing data-aware code suggestions. It automatically highlights the most relevant data based on the user's code, integrates both code and data contexts for more accurate suggestions, and instantly previews data transformation results for easy verification. To evaluate the effectiveness and usability of Xavier, we conducted a user study with 16 data analysts, which showed its potential to streamline the authoring of data wrangling scripts.
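To make the idea of linking code and data contexts concrete, the following is a minimal sketch, not Xavier's actual implementation. It assumes a table held as a list of row dictionaries and uses two hypothetical helpers, `referenced_columns` and `data_context`, to find the columns mentioned in a partially typed line of code and summarize their types and sample values. Such a summary could then be highlighted for the user or passed alongside the code to a completion model.

```python
import re

def referenced_columns(code, columns):
    """Return table columns whose names appear as string literals in the code."""
    literals = set(re.findall(r"""["']([^"']+)["']""", code))
    return [c for c in columns if c in literals]

def data_context(rows, code, max_values=3):
    """Summarize type and sample values for the columns the code mentions."""
    columns = list(rows[0].keys()) if rows else []
    context = {}
    for col in referenced_columns(code, columns):
        values = [r[col] for r in rows]
        # Keep the first few distinct values, preserving their order.
        samples = list(dict.fromkeys(values))[:max_values]
        context[col] = {"type": type(values[0]).__name__, "samples": samples}
    return context

# Illustrative data and a partially typed wrangling statement (both made up).
rows = [
    {"city": "Hangzhou", "price": 12.5, "sold": 3},
    {"city": "Hong Kong", "price": 9.0, "sold": 7},
    {"city": "Hangzhou", "price": 11.0, "sold": 2},
]
code = 'df.groupby("city")["price"].'
ctx = data_context(rows, code)
print(ctx)
```

Only the columns the code refers to (`city` and `price` here) end up in the summary, which keeps the context small and focused on the data the user is currently working with.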
