Flowco: Rethinking Data Analysis in the Age of LLMs

Stephen N. Freund, Brooke Simon, Emery D. Berger, Eunice Jun

TL;DR

Flowco presents a mixed-initiative system that uses a visual dataflow graph model to organize data analyses and integrate LLMs throughout the workflow. By enforcing modular, multi-layer abstractions, explicit dependencies, and robust validation (assertions and unit tests), Flowco aims to improve reliability, reproducibility, and accessibility compared to traditional notebooks and vanilla LLM usage. The authors demonstrate Flowco on clustering, multiverse analyses, and logistic regression, and validate its usability with a user study of 12 data-science students, who report benefits in organization, trust, and ease of use, despite latency concerns. The study and system design point to Flowco as a practical path toward robust, transparent, and scalable AI-assisted data analysis, with future work including hierarchical graphs, statistical correctness enhancements, and notebook integration.

Abstract

Conducting data analysis typically involves authoring code to transform, visualize, analyze, and interpret data. Large language models (LLMs) are now capable of generating such code for simple, routine analyses. LLMs promise to democratize data science by enabling those with limited programming expertise to conduct data analyses, including in scientific research, business, and policymaking. However, analysts in many real-world settings must often exercise fine-grained control over specific analysis steps, verify intermediate results explicitly, and iteratively refine their analytical approaches. Such tasks present barriers to building robust and reproducible analyses using LLMs alone or even in conjunction with existing authoring tools (e.g., computational notebooks). This paper introduces Flowco, a new mixed-initiative system to address these challenges. Flowco leverages a visual dataflow programming model and integrates LLMs into every phase of the authoring process. A user study suggests that Flowco supports analysts, particularly those with less programming experience, in quickly authoring, debugging, and refining data analyses.

Paper Structure

This paper contains 35 sections, 7 figures, 1 table.

Figures (7)

  • Figure 1: A Jupyter notebook exhibiting a potentially stale variable. To the left of each cell is the execution count, which indicates the order in which the cells were evaluated. The cell labeled with execution count 7 (highlighted) was evaluated after cluster_names was initialized in the cell with execution count of 5. The latter depends on the former, meaning that cluster_names could be stale.
  • Figure 2: The Flowco editor interface is divided into three panels: (Left) the project panel encompasses global actions and the "Ask Me Anything!" (AMA) chat box; (Center) the canvas is the visual editor for Flowco dataflow graphs; and (Right) the details panel presents details of the selected node during editing. (A) The user creates a new node to load the dataset in beaks.csv. (B) The user presses the Run button to synthesize code to evaluate the dataflow graph. (C) After evaluating the node, Flowco provides a sample of the dataset in the canvas. (D) The user examines the full dataset in the details panel. (E) The user prompts the AMA chat box to "Describe the dataset". (F) Flowco responds as it performs a number of analyses on the dataset.
  • Figure 3: (G) The user adds two plotting nodes to the graph, as well as (H) a node to select only the Fortis finches from the dataset. After running the graph, (I) the user selects Select-Fortis to examine it in the details panel. (J) The user exposes the synthesized code by selecting the "Code" abstraction level.
  • Figure 4: Flowco enables the user to validate run-time assertion checks on node outputs via the Checks view. After (K) adding nodes to estimate the mean beak length for the Fortis finches, (L) the user switches to the Checks view and (M) clicks the pencil icon that appears while hovering over Bootstrap-Average to bring up the dialog box shown below the screenshot. (N) The user manually adds the check "Verify that the length of the bootstrap_average list is at least 5,000", and then (O) clicks the Suggest button to have Flowco suggest several additional checks for that node. After saving the checks, the dialog closes and (P) the user then clicks the Check button to verify that all checks pass. (Q) Flowco reports a failure for the Bootstrap-Average node.
  • Figure 6: The user directly modifies the components of a node via an editor dialog box. The dialog box allows the user to directly edit the node's title, summary label, requirements, and code. The user may also propagate changes in one component to others. For example, (S) Alex adds a requirement, which (T) brings up a warning that the node's different components may not be consistent. (U) The user then clicks the propagation button to update the summary label and code. The editor also supports making modifications via chat, checking consistency between the components, and regenerating the components from scratch.
  • ...and 2 more figures
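
The run-time check quoted in the Figure 4 caption ("Verify that the length of the bootstrap_average list is at least 5,000") can be illustrated with a minimal Python sketch. This is not the code Flowco synthesizes. The function names `bootstrap_average` and `check_bootstrap_average`, the `n_resamples` parameter, and the beak-length values are all hypothetical and chosen only for illustration. The caption does not say why the check fails in the paper, so the sketch triggers a failure in one assumed way, by drawing too few resamples.

```python
import random
import statistics


def bootstrap_average(values, n_resamples=5000, seed=0):
    """Return a list of means, one per resample drawn with replacement."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]


def check_bootstrap_average(result, min_length=5000):
    """Run-time check modeled on the caption's 'at least 5,000' requirement."""
    assert isinstance(result, list), "expected a list of bootstrap means"
    assert len(result) >= min_length, (
        f"expected at least {min_length} bootstrap means, got {len(result)}"
    )


# Hypothetical beak lengths in millimetres; not data from the paper.
beak_lengths = [10.1, 10.6, 9.8, 11.2, 10.4, 10.9, 9.7, 10.3]

passing = bootstrap_average(beak_lengths, n_resamples=5000)
check_bootstrap_average(passing)  # enough resamples, so no error is raised

# Assumed failure mode for illustration: too few resamples.
failing = bootstrap_average(beak_lengths, n_resamples=1000)
try:
    check_bootstrap_average(failing)
except AssertionError as err:
    print(f"Check failed: {err}")
```

Keeping the check separate from the computation mirrors the caption's separation between a node's output and the checks attached to it: the check can be rerun or tightened without touching the code that produced the value.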