SuperNOVA: Design Strategies and Opportunities for Interactive Visualization in Computational Notebooks

Zijie J. Wang, David Munechika, Seongmin Lee, Duen Horng Chau

TL;DR

The paper addresses the lack of design guidance for interactive notebook visualizations through a large-scale systematic review of 163 notebook visualization tools (64 from academic papers, 105 found in the wild). It introduces an organizational framework that captures each tool's motivations, users, and design patterns along four dimensions. The methodology scrapes 8.6 million notebooks, filters 984 candidates down to 105 tools, and applies coding and quantitative analyses to relate design choices to impact; notably, tools compatible with more notebook platforms tend to have more GitHub stars and citations. The authors also present SuperNOVA, an open-source interactive explorer for browsing and comparing notebook visualization tools, and discuss design implications, trade-offs, and opportunities for democratizing tool creation, cross-platform integration, and responsible AI workflows. Together, the work offers practical guidance for researchers and developers designing more adoptable notebook visualizations and provides a community resource for future work on notebook-based data exploration and storytelling.

Abstract

Computational notebooks, such as Jupyter Notebook, have become data scientists' de facto programming environments. Many visualization researchers and practitioners have developed interactive visualization tools that support notebooks, yet little is known about the appropriate design of these tools. To address this critical research gap, we investigate the design strategies in this space by analyzing 163 notebook visualization tools. Our analysis encompasses 64 systems from academic papers and 105 systems sourced from a pool of 55k notebooks containing interactive visualizations that we obtain via scraping 8.6 million notebooks on GitHub. Through this study, we identify key design implications and trade-offs, such as leveraging multimodal data in notebooks as well as balancing the degree of visualization-notebook integration. Furthermore, we provide empirical evidence that tools compatible with more notebook platforms have a greater impact. Finally, we develop SuperNOVA, an open-source interactive browser to help researchers explore existing notebook visualization tools. SuperNOVA is publicly accessible at: https://poloclub.github.io/supernova/.
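
The abstract mentions narrowing 8.6 million scraped notebooks to a pool of about 55k that contain interactive visualizations, but it does not say how interactivity was detected. As a rough illustration only, the sketch below flags a notebook when a code cell's stored output carries a Jupyter widget or JavaScript MIME type, or HTML with an embedded script or iframe. The MIME list, the HTML markers, and the function names are assumptions made for this example, not the authors' actual filter.

```python
import json

# Assumed signals of interactivity; illustrative only, not the paper's filter.
WIDGET_MIME_TYPES = {
    "application/javascript",
    "application/vnd.jupyter.widget-view+json",
}
HTML_MARKERS = ("<script", "<iframe")


def _as_text(value) -> str:
    """Notebook files may store multi-line strings as a list of lines."""
    return "".join(value) if isinstance(value, list) else str(value)


def has_interactive_output(notebook: dict) -> bool:
    """Return True if any code cell output looks interactive (heuristic)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            # display_data / execute_result outputs keep a MIME bundle in "data";
            # stream and error outputs have no "data" key and are skipped.
            bundle = output.get("data", {})
            if WIDGET_MIME_TYPES & set(bundle):
                return True
            html = _as_text(bundle.get("text/html", "")).lower()
            if any(marker in html for marker in HTML_MARKERS):
                return True
    return False


def load_notebook(path: str) -> dict:
    """Read an .ipynb file, which is plain JSON on disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

A call such as has_interactive_output(load_notebook("example.ipynb")) would apply the check to one file, where the file name is a placeholder. A heuristic like this is coarse: it misses visualizations whose outputs were cleared before the notebook was committed, and it can flag HTML that embeds scripts for other reasons, so a real pipeline would likely need manual review of the candidates.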

Paper Structure (26 sections, 10 figures)

Figures (10)

  • Figure 1: We present an organizational framework to characterize notebook visualization tools based on their design motivations and strategies through a review of 163 tools.
  • Figure 2: Many notebook visualization tools are developed for educators and students, such as GILP, which offers interactive and easy-to-understand visualizations to help students learn about linear programming algorithms. Educators can directly integrate GILP into notebook-based assignments.
  • Figure 3: Computational notebooks offer unique opportunities for visualization tools to read and refine users' artifacts, such as code, data, and models. For example, (A) Lux leverages a user's data transformation code to recommend visualizations, while (B) GAM Changer enables users to interactively edit an ML model's learned weights.
  • Figure 4: The vibrant notebook ecosystem enables developers to easily transfer their visualizations across various platforms. For example, (A) the Python library InterpretML's notebook explainable ML visualizations are also used on (B) its documentation website via Jupyter Book.
  • Figure 5: The integration level between notebook and visualization tools varies based on data communication channels. (A) Tools such as Argo Lite retrieve data from external servers instead of the notebook. (B) Visual Auditor visualizes different slices of the dataset that are sent from the notebook. (C) More integrated tools like pydeck not only visualize data from the notebook but also send data back to the notebook, for example, information on a user's selected map cells.
  • ...and 5 more figures