Kishu: Time-Traveling for Computational Notebooks

Zhaoheng Li, Supawit Chockchowwat, Ribhav Sahu, Areet Sheth, Yongjoo Park

TL;DR

Kishu adds time-traveling to computational notebooks through an application-level, delta-based approach that records how the session state evolves at a novel Co-variable granularity. It uses live namespace patching, a Delta Detector, and a Checkpoint Graph to capture per-cell state deltas and to check out earlier states incrementally at sub-second latency, while preserving inter-variable dependencies. When data cannot be serialized, Kishu falls back to recomputation to restore it. Kishu is compatible with 146 object classes from popular data-science libraries, and it reduces checkpoint size by up to 4.55x and checkout time by up to 9.02x compared to baselines. The evaluation reports low delta-detection overhead across diverse notebooks, which makes fault-tolerant undo and exploration of alternative execution paths practical within a single kernel.

Abstract

Computational notebooks (e.g., Jupyter, Google Colab) are widely used by data scientists. A key feature of notebooks is the interactive computing model of iteratively executing cells (i.e., a set of statements) and observing the result (e.g., model or plot). Unfortunately, existing notebook systems do not offer time-traveling to past states: when the user executes a cell, the notebook session state consisting of user-defined variables can be irreversibly modified - e.g., the user cannot 'un-drop' a dataframe column. This is because, unlike DBMS, existing notebook systems do not keep track of the session state. Existing techniques for checkpointing and restoring session states, such as OS-level memory snapshot or application-level session dump, are insufficient: checkpointing can incur prohibitive storage costs and may fail, while restoration can only be inefficiently performed from scratch by fully loading checkpoint files. In this paper, we introduce a new notebook system, Kishu, that offers time-traveling to and from arbitrary notebook states using an efficient and fault-tolerant incremental checkpoint and checkout mechanism. Kishu creates incremental checkpoints that are small and correctly preserve complex inter-variable dependencies at a novel Co-variable granularity. Then, to return to a previous state, Kishu accurately identifies the state difference between the current and target states to perform incremental checkout at sub-second latency with minimal data loading. Kishu is compatible with 146 object classes from popular data science libraries (e.g., Ray, Spark, PyTorch), and reduces checkpoint size and checkout time by up to 4.55x and 9.02x, respectively, on a variety of notebooks.
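
The incremental checkpoint and checkout described above can be illustrated with a toy store. This is a minimal sketch under stated assumptions, not Kishu's implementation: the class `ToyStore`, its methods, the use of `pickle`, and the linear checkpoint history are all hypothetical choices made for illustration, and the caller is assumed to already know which Co-variables a cell changed. Each checkpoint writes only the changed Co-variables, and a checkout reloads only the Co-variables whose stored version differs between the current and target states.

```python
import pickle

class ToyStore:
    """Hypothetical delta store keyed by Co-variable (a frozenset of names)."""

    def __init__(self):
        self.blobs = {}    # (covariable, checkpoint id) -> pickled bytes
        self.states = []   # per checkpoint: {covariable: checkpoint id of its blob}

    def checkpoint(self, namespace, changed):
        """Persist only the Co-variables in `changed` (assumes linear history)."""
        versions = dict(self.states[-1]) if self.states else {}
        # Drop older entries that share a name with a changed Co-variable.
        versions = {cv: v for cv, v in versions.items()
                    if not any(cv & c for c in changed)}
        ckpt_id = len(self.states)
        for cv in changed:
            # Pickling members together keeps their shared references intact.
            self.blobs[(cv, ckpt_id)] = pickle.dumps({n: namespace[n] for n in cv})
            versions[cv] = ckpt_id
        self.states.append(versions)
        return ckpt_id

    def checkout(self, namespace, current_id, target_id):
        """Load only the Co-variables that differ between the two states."""
        current, target = self.states[current_id], self.states[target_id]
        for cv, version in current.items():
            if target.get(cv) != version:
                for name in cv:
                    namespace.pop(name, None)
        loaded = []
        for cv, version in target.items():
            if current.get(cv) != version:
                namespace.update(pickle.loads(self.blobs[(cv, version)]))
                loaded.append(cv)
        return loaded

ns = {}
store = ToyStore()
shared = [1, 2, 3]
c_list = [4, 5]
ns.update(a={"data": shared}, b=[shared], c=c_list)
ab, c = frozenset({"a", "b"}), frozenset({"c"})

s0 = store.checkpoint(ns, changed=[ab, c])
shared.append(4)                        # a later cell mutates the shared list
s1 = store.checkpoint(ns, changed=[ab])
loaded = store.checkout(ns, current_id=s1, target_id=s0)
```

In this example only the Co-variable `{a, b}` is reloaded, `c` stays in place untouched, and the restored `a` and `b` still point at one shared list because they were serialized together.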

Paper Structure (100 sections, 1 theorem, 24 figures, 8 tables)

This paper contains 100 sections, 1 theorem, 24 figures, 8 tables.

Key Result

Lemma 1

A Co-variable $\mathcal{X} = \{x_1,...,x_i\}$ can be updated by a cell execution only if at least one of $x_1,...,x_i$ was accessed in the code.
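
Lemma 1 suggests a pruning rule: after a cell runs, only Co-variables with at least one accessed member need to be checked for updates. The sketch below approximates "accessed" by statically collecting identifiers with Python's `ast` module. This is an assumption made for illustration: the helper names `accessed_names` and `candidate_covariables` are hypothetical, and a static scan misses dynamic accesses (for example through `eval` or `globals()`), which is one reason a system may prefer to observe accesses at run time instead.

```python
import ast

def accessed_names(cell_code: str) -> set[str]:
    """Collect every identifier that appears as a Name node in the cell."""
    tree = ast.parse(cell_code)
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

def candidate_covariables(cell_code, covariables):
    """Keep only Co-variables that have at least one accessed member."""
    names = accessed_names(cell_code)
    return [cv for cv in covariables if cv & names]

covariables = [
    frozenset({"df", "view"}),   # two names sharing underlying data
    frozenset({"model"}),
    frozenset({"labels"}),
]
cell = "view['price'] = view['price'] * 2"
candidates = candidate_covariables(cell, covariables)
```

Here the cell names only `view`, so `{df, view}` is the single candidate; `model` and `labels` can be skipped without inspecting their contents.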

Figures (24)

  • Figure 1: Our system (attached to the kernel, right) enables time-traveling to and from arbitrary notebook states.
  • Figure 2: Pattern of a Sklearn notebook sklearntweet: (Top) many cells incrementally access a small portion of the state. (Bottom) Users balance data creation and modification.
  • Figure 3: Co-variables are connected components of objects. We can treat them as independent data tables.
  • Figure 4: Co-variable granularity deltas allow us to create size-efficient incremental checkpoints (vs. memory-page level deltas), and to incrementally check out previous states.
  • Figure 5: Kishu architecture. It utilizes a hook to observe session state deltas and transparently write/replace data in the kernel namespace for incremental checkpoint/checkout.
  • ...and 19 more figures
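
The caption of Figure 3 describes Co-variables as connected components of objects. A minimal sketch of that idea follows, assuming that two variables belong to the same component when they can reach a common mutable object through containers or attributes. The functions `reachable_ids` and `covariables` are hypothetical names, and the traversal covers only built-in containers and `__dict__` attributes, so it is an illustration rather than the paper's detection procedure.

```python
ATOMS = (int, float, complex, bool, str, bytes, type(None))

def reachable_ids(obj, seen=None):
    """Ids of mutable objects reachable from obj via containers and attributes."""
    if seen is None:
        seen = set()
    # Immutable atoms may be shared by the interpreter between unrelated variables.
    if isinstance(obj, ATOMS):
        return seen
    # Immutable containers are traversed but not recorded (e.g. the shared empty tuple).
    if not isinstance(obj, (tuple, frozenset)):
        if id(obj) in seen:
            return seen
        seen.add(id(obj))
    if isinstance(obj, dict):
        children = list(obj.keys()) + list(obj.values())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        children = list(obj)
    else:
        children = list(vars(obj).values()) if hasattr(obj, "__dict__") else []
    for child in children:
        reachable_ids(child, seen)
    return seen

def covariables(namespace):
    """Group variable names whose reachable object sets overlap."""
    groups = []  # list of (names, ids); id sets stay pairwise disjoint
    for name, value in namespace.items():
        names, ids = {name}, reachable_ids(value)
        remaining = []
        for g_names, g_ids in groups:
            if ids & g_ids:
                names |= g_names
                ids |= g_ids
            else:
                remaining.append((g_names, g_ids))
        remaining.append((names, ids))
        groups = remaining
    return [frozenset(names) for names, _ in groups]

shared = [1, 2, 3]
ns = {"a": {"data": shared}, "b": [shared, "x"], "c": [4, 5]}
result = covariables(ns)
```

In this example `a` and `b` both reach the list `shared`, so they form one Co-variable, while `c` forms its own.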

Theorems & Definitions (7)

  • Definition 1
  • Definition 2
  • Definition 3
  • Lemma 1
  • Definition 4
  • Definition 5
  • Definition 6