DATAWEAVER: Authoring Data-Driven Narratives through the Integrated Composition of Visualization and Text

Yu Fu, Dennis Bromley, Vidya Setlur

TL;DR

DataWeaver addresses the challenge of producing cohesive data-driven stories by offering a bidirectional authoring framework that tightly couples visualizations and narratives through a callout-driven data-fact layer. It supports vis-to-text and text-to-vis workflows within a flow-based UI, leveraging LLMs to generate narratives or charts while keeping raw data out of the models for privacy. An evaluation with 13 participants and a 1-week diary study reports high usability (SUS of 85.77) and positive impact on authoring efficiency, while highlighting needs for improved data-facts filtering, customization, and more robust text-first chart generation under data constraints. The work demonstrates a practical path for human-AI collaboration in data storytelling, enabling flexible workflows, reduced transcription effort, and scalable integration of visual and textual narrative components.

Abstract

Data-driven storytelling has gained prominence in journalism and other data reporting fields. However, the process of creating these stories remains challenging, often requiring the integration of effective visualizations with compelling narratives to form a cohesive, interactive presentation. To help streamline this process, we present an integrated authoring framework and system, DataWeaver, that supports both visualization-to-text and text-to-visualization composition. DataWeaver enables users to create data narratives anchored to data facts derived from "call-out" interactions, i.e., user-initiated highlights of visualization elements that prompt relevant narrative content. In addition to this "vis-to-text" composition, DataWeaver also supports a "text-initiated" approach, generating relevant interactive visualizations from existing narratives. Key findings from an evaluation with 13 participants highlighted the utility and usability of DataWeaver and the effectiveness of its integrated authoring framework. The evaluation also revealed opportunities to enhance the framework by refining its filtering mechanisms and visualization recommendations, and to better support authoring creativity by introducing advanced customization options.
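
To make the vis-to-text path concrete, here is a minimal Python sketch of how a call-out on one chart element might be turned into data facts and then into an LLM prompt. It is an illustration under stated assumptions, not DataWeaver's implementation: the `DataFact` class, the three fact types (value, rank, share), and the function names `compute_data_facts` and `build_prompt` are all hypothetical. The one property taken from the description above is that the prompt is built from selected facts and metadata only, so raw rows never reach the model.

```python
from dataclasses import dataclass


@dataclass
class DataFact:
    kind: str  # hypothetical fact type: "value", "rank", or "share"
    text: str  # human-readable statement an author could select


def compute_data_facts(rows, category, measure, highlighted):
    """Derive simple facts about one highlighted category (assumed fact types)."""
    # Aggregate the measure per category value.
    totals = {}
    for row in rows:
        totals[row[category]] = totals.get(row[category], 0) + row[measure]
    if highlighted not in totals:
        raise KeyError(f"{highlighted!r} is not a value of {category!r}")

    value = totals[highlighted]
    ranked = sorted(totals, key=totals.get, reverse=True)
    rank = ranked.index(highlighted) + 1
    share = value / sum(totals.values())
    return [
        DataFact("value", f"{highlighted} has total {measure} of {value}."),
        DataFact("rank", f"{highlighted} ranks {rank} of {len(ranked)} by {measure}."),
        DataFact("share", f"{highlighted} accounts for {share:.0%} of all {measure}."),
    ]


def build_prompt(selected_facts, metadata):
    """Build the prompt from selected facts and chart metadata; raw rows are not included."""
    lines = [
        f"Chart: {metadata['chart_type']} of {metadata['measure']} by {metadata['category']}.",
        "Write a short narrative grounded only in these facts:",
    ]
    lines += [f"- {fact.text}" for fact in selected_facts]
    return "\n".join(lines)
```

A caller would pass the rows behind a chart and the highlighted category to `compute_data_facts`, let the author pick a subset of the returned facts, and hand only that subset plus metadata to `build_prompt`. Restricting the prompt to derived statements is one simple way to keep the underlying dataset out of the model's input.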

Paper Structure

This paper contains 17 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: The integrated framework underlying DataWeaver. The interface is detailed in Figure 2. The purple flow shows Vis-to-Text composition (Figure 3); the teal flow shows Text-to-Vis composition (Figure 4).
  • Figure 2: An overview of DataWeaver's interface. The Authoring Canvas (A) is a flow-based, zoomable interface where users can add vis-nodes (V) using a visualization engine and text-nodes (T), as well as edges to connect them. The Insight Cart (B) is used for insight management, while the Review Page (C) allows users to reorder the story and convert the story pieces into different presentation formats.
  • Figure 3: The Vis-to-Text composition workflow. After users apply a callout interaction to a visualization (S1), DataWeaver computes the data facts and presents them in the Insight Cart (S2). Users then select the desired data facts (S3). An LLM generates data narratives from the selected data facts and metadata (S4). Users can revise the generated narratives using the buttons (S5).
  • Figure 4: The Text-to-Vis composition workflow. Users first type or select the text to focus on (S1), and DataWeaver retrieves and processes the datasets from upstream nodes (S2). An LLM then interprets the text and metadata and recommends relevant charts (S3). Based on the chart types, the LLM generates JSON specifications containing both the data-operation and visualization schemas used to create interactive charts (S4). Finally, users review the generated charts and add the desired ones as new vis-nodes (S5).
  • Figure 5: Questionnaire response distributions on a 5-point Likert scale. The top chart presents results for the nine utility questions, while the bottom two display the usability questions. White lines and numerical values represent adjusted scores (out of 100), calculated using the SUS algorithm (Brooke, 1996).
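
For readers unfamiliar with the SUS algorithm mentioned in the Figure 5 caption, the following Python sketch shows the standard System Usability Scale scoring: ten items rated 1 to 5, where odd-numbered (positively worded) items contribute `rating - 1`, even-numbered (negatively worded) items contribute `5 - rating`, and the sum is multiplied by 2.5 to give a score out of 100. The function name `sus_score` is hypothetical and the responses in the usage note are invented; how the paper adapts this scoring to its utility questions is not described here, so this covers only the standard ten-item form.

```python
def sus_score(responses):
    """Return the standard SUS score (0-100) for ten ratings on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 item ratings")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError(f"item {item_number}: rating must be between 1 and 5")
        if item_number % 2 == 1:
            total += rating - 1  # positively worded item
        else:
            total += 5 - rating  # negatively worded item
    return total * 2.5
```

As an invented example, one participant answering `[4, 2, 5, 1, 4, 2, 5, 2, 4, 1]` would score 85.0. A study-level figure such as the reported 85.77 would typically be the mean of such per-participant scores.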