Tidynote: Always-Clear Notebook Authoring

Ruanqianqian Huang, Brian Hempel, Yining Cao, James D. Hollan, Haijun Xia, Sorin Lerner

TL;DR

An exploratory study of open-ended data analysis tasks shows that Tidynote features holistically promote clarity throughout a notebook's lifecycle, support realistic notebook tasks, and enable novel strategies for notebook clarity.

Abstract

Recent work identified clarity as one of the top quality attributes that notebook users value, but notebooks lack support for maintaining clarity throughout the exploratory phases of the notebook authoring workflow. We propose always-clear notebook authoring that supports both clarity and exploration, and present a Jupyter implementation called Tidynote. The key to Tidynote is three-fold: (1) a scratchpad sidebar to facilitate exploration, (2) cells movable between the notebook and the scratchpad to maintain organization, and (3) linear execution with state forks to clarify program state. An exploratory study (N=13) of open-ended data analysis tasks shows that Tidynote features holistically promote clarity throughout a notebook's lifecycle, support realistic notebook tasks, and enable novel strategies for notebook clarity. These results suggest that Tidynote supports maintaining clarity throughout the entirety of notebook authoring.

Paper Structure (23 sections, 6 figures, 4 tables)

This paper contains 23 sections, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Comparative analysis of different strategies for managing notebook messes. We compare the mechanisms of three existing strategies (out-of-notebook cells [spadextension, Wang2022:StickyLand], post-hoc cleaning [ruleAidingCollaborativeReuse2018, Head2019:Managing], and state forks [weinmanForkItSupporting2021]) with our envisioned always-clear notebook authoring, across five actions relevant to keeping a notebook clear throughout its lifecycle: performing exploration (global and cell-based), clearing unused exploration, iterating between exploration and clarity, and reducing state messes. "No additional support" means the actions are achievable manually, but the corresponding approach provides no support beyond baseline Jupyter; "Disallowed" means such actions cannot be performed at all. Our approach, shown in the last row, is the only strategy that supports keeping a notebook clear throughout its entire lifecycle.
  • Figure 2: (A) The full Tidynote interface, showing Ali's initial notebook with the scratchpad hidden. (B) Clicking the button expands the right margin to reveal the (C) scratchpad and moves the cell into it. (D) The button moves a cell back to the notebook. (Each code cell starts with two lines of automatically generated comments.)
  • Figure 3: Cells can be pinned (E) to prevent them from scrolling offscreen. The scratchpad can contain multiple scratch sections (G), (H), (I), each with state independent of the others and of the main notebook. Multiple scratch sections might branch from the same main notebook cell, e.g., (H) and (I) from (J).
  • Figure 4: Rerunning the first cell (df = ...) is a nonlinear execution: the outputs of later cells are grayed out (L) to indicate staleness.
  • Figure 5: Cell movement events and percentage of time spent in the notebook versus the scratchpad, ordered from top to bottom by percentage of time spent in the scratchpad. "Time through tasks" is the interval between the start of the first notebook action after the tutorial and the end of the last action before the tasks ended. Time progress is normalized across participants, who are labeled by their strategies for clarity (described in the results subsection on strategies).
  • ...and 1 more figure
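
The captions for Figures 3 and 4 describe two state mechanisms: scratch sections that branch from a main notebook cell with independent state, and staleness marking when an earlier cell is rerun. The toy model below may help make those two ideas concrete. It is only a sketch under stated assumptions, not Tidynote's implementation: the names `LinearNotebook`, `run_cell`, and `fork` are hypothetical, cells are plain Python strings, and state is copied with `copy.deepcopy`, which a real kernel could not do for arbitrary objects.

```python
import copy


def run_cell(source, state):
    """Run one cell against a copy of `state` and return the resulting state."""
    namespace = copy.deepcopy(state)
    exec(source, namespace)
    namespace.pop("__builtins__", None)  # added by exec(); not user state
    return namespace


class LinearNotebook:
    """Hypothetical toy model of linear execution with state forks."""

    def __init__(self, cells):
        self.cells = list(cells)
        self.states = [None] * len(self.cells)  # state after each cell
        self.stale = [True] * len(self.cells)   # True = output needs a rerun

    def run(self, index):
        # Linear execution: a cell starts from the state left by the cell
        # directly above it, so that cell must be up to date.
        if index > 0 and self.stale[index - 1]:
            raise RuntimeError("rerun the cells above first")
        before = {} if index == 0 else self.states[index - 1]
        self.states[index] = run_cell(self.cells[index], before)
        self.stale[index] = False
        # Everything below was computed from older state: mark it stale.
        for later in range(index + 1, len(self.cells)):
            self.stale[later] = True

    def run_all(self):
        for index in range(len(self.cells)):
            self.run(index)

    def fork(self, index):
        """Return an independent copy of the state after cell `index`."""
        if self.stale[index]:
            raise RuntimeError("cannot fork from a stale cell")
        return copy.deepcopy(self.states[index])


nb = LinearNotebook(["x = [1, 2]", "y = sum(x)", "z = y * 2"])
nb.run_all()

# Editing and rerunning the first cell marks the later cells stale.
nb.cells[0] = "x = [10]"
nb.run(0)

# A scratch section forks from cell 0; its changes stay out of the notebook.
scratch = run_cell("x.append(5)", nb.fork(0))
```

In this sketch, the scratch state holds the modified `x` while the notebook's own state after the first cell is unchanged, and the two later cells stay marked stale until they are rerun in order.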