"Don't Step on My Toes": Resolving Editing Conflicts in Real-Time Collaboration in Computational Notebooks

April Yi Wang, Zihan Wu, Christopher Brooks, Steve Oney

TL;DR

The paper addresses editing conflicts in real-time collaboration on computational notebooks by introducing PADLOCK, a JupyterLab extension that provides three mechanisms: cell-level access control, which governs who can view and edit each cell; variable-level access control, which protects runtime variables; and parallel cell groups, which enable scoped, concurrent exploration. These features are designed to support diverse collaboration styles and to reduce subtle, hard-to-debug interference in shared notebooks. A laboratory study indicates that parallel cell groups are well received and can improve notebook organization. It also points to areas for future work, such as activity histories and merge notifications, and to the need for better awareness of collaborators' actions. Overall, PADLOCK offers fine-grained, runtime-aware conflict prevention within a widely used notebook environment, with potential applications in classrooms and open collaboration.

Abstract

Real-time collaborative editing in computational notebooks can improve the efficiency of teamwork for data scientists. However, working together through synchronous editing of notebooks introduces new challenges. Data scientists may inadvertently interfere with each other's work by altering the shared codebase and runtime state if they do not set up a social protocol for working together and monitoring their collaborators' progress. In this paper, we propose a real-time collaborative editing model for resolving conflicting edits in computational notebooks. The model introduces three levels of edit protection to help collaborators avoid introducing errors into the program source code and unintended changes to the runtime state.

Paper Structure (16 sections, 2 figures)

Figures (2)

  • Figure 1: Editing conflicts in real-time collaborative notebooks can be implicit. As shown on the left, one can get an unexpected execution result because the collaborator accidentally changed the shared variable. As shown on the right, PADLOCK helps data scientists resolve editing conflicts in real-time collaborative editing in computational notebooks.
  • Figure 2: Overview of the three conflict-free mechanisms in PADLOCK. (A) Cell-level access control allows collaborators to claim ownership of code cells and restrict others from editing or viewing them: (A1) Unchecking edit access for a user changes the cell background and disables editing; (A2) Unchecking read access for a user blurs the cell. (B) Variable-level access control extends the idea of access control from cells to shared variables: (B1) Unchecking a user's variable access; (B2) After losing access, the user cannot edit the variable and receives a warning when attempting to do so. (C) Parallel cell groups define a designated area where changes to the code and runtime state stay within the group's own scope: (C1) One parallel cell group can contain multiple tabs; (C2) Each tab can contain multiple cells; (C3) Sync the variables from the global scope to the current active tab; (C4) Add a new tab; (C5) Click the radio button to mark it as the "main" tab.
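
The ideas behind mechanisms (B) and (C) in Figure 2 can be illustrated with a small, self-contained Python sketch. This is not PADLOCK's implementation, which is a JupyterLab extension; the names `SharedNamespace`, `ParallelTab`, `revoke_edit`, and `sync_from_global` are hypothetical and chosen only for illustration. The sketch assumes that variable-level access control amounts to a per-variable list of users who may not write, and that each tab in a parallel cell group runs against a private copy of the global scope.

```python
import copy


class AccessDenied(Exception):
    """Raised when a user writes to a variable they may not edit."""


class SharedNamespace:
    """Toy stand-in for a notebook's shared runtime state (hypothetical)."""

    def __init__(self):
        self._vars = {}
        self._denied = {}  # variable name -> set of users without edit access

    def revoke_edit(self, name, user):
        # Analogous to unchecking a user's variable access (B1).
        self._denied.setdefault(name, set()).add(user)

    def set(self, user, name, value):
        # Reject the write and surface a warning-style error (B2).
        if user in self._denied.get(name, set()):
            raise AccessDenied(f"{user} may not edit {name!r}")
        self._vars[name] = value

    def get(self, name):
        return self._vars[name]

    def snapshot(self):
        # Deep copy so that tabs cannot mutate shared objects in place.
        return copy.deepcopy(self._vars)


class ParallelTab:
    """One tab of a parallel cell group with its own private scope (hypothetical)."""

    def __init__(self, shared):
        self._shared = shared
        self.scope = shared.snapshot()

    def sync_from_global(self):
        # Analogous to syncing variables from the global scope (C3).
        self.scope = self._shared.snapshot()

    def run(self, source):
        # Assignments land in this tab's scope, not in the shared namespace.
        exec(source, {}, self.scope)


ns = SharedNamespace()
ns.set("alice", "threshold", 0.5)
ns.revoke_edit("threshold", "bob")

try:
    ns.set("bob", "threshold", 0.9)
    warning = None
except AccessDenied as err:
    warning = str(err)

tab_a = ParallelTab(ns)
tab_b = ParallelTab(ns)
tab_a.run("threshold = threshold * 2")  # only tab_a sees the new value
```

In this sketch, Bob's write is rejected, and the reassignment in `tab_a` leaves both `tab_b` and the shared namespace untouched. Marking a tab as the "main" tab (C5) is left out because the caption does not describe what that operation does to the global scope.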