GistVis: Automatic Generation of Word-scale Visualizations from Data-rich Documents

Ruishi Zou, Yinqi Tang, Jingzhu Chen, Siyu Lu, Yan Lu, Yingfan Yang, Chen Ye

TL;DR

GistVis is a modular, LLM-guided pipeline that automatically generates word-scale visualizations inside data-rich documents to support document-centric reading. It encodes insights as data facts and passes them through four modules (Discoverer, Annotator, Extractor, and Visualizer) to produce interactive word-scale visualizations that link to the text and cover six core data-fact types. A technical evaluation shows competitive segmentation and labeling performance, and a user study (N=12) shows improved accuracy, reduced workload, and meaningful engagement with the visualizations. The approach enables in situ data storytelling and could augment reading workflows, although its design space, data requirements, and integration warrant further work.

Abstract

Data-rich documents are ubiquitous in various applications, yet they often rely solely on textual descriptions to convey data insights. Prior research has primarily focused on providing visualization-centric augmentation to data-rich documents. However, few studies have explored using automatically generated word-scale visualizations to enhance the document-centric reading process. As an exploratory step, we propose GistVis, an automatic pipeline that extracts and visualizes data insights from text descriptions. GistVis decomposes the generation process into four modules: Discoverer, Annotator, Extractor, and Visualizer, with the first three modules utilizing the capabilities of large language models and the fourth using visualization design knowledge. A technical evaluation, including a comparative study on Discoverer and an ablation study on Annotator, reveals decent performance of GistVis. Meanwhile, the user study (N=12) showed that GistVis could generate satisfactory word-scale visualizations, indicating its effectiveness in facilitating users' understanding of data-rich documents (+5.6% accuracy) while significantly reducing their mental demand (p=0.016) and perceived effort (p=0.033).
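To make the four-module decomposition concrete, the following is a minimal sketch of how such a sequential pipeline could be wired together. It is not the authors' implementation: the real Discoverer, Annotator, and Extractor rely on a large language model, whereas the stand-ins here use simple regular expressions so the example runs on its own. The `Segment` class, the function names, the two placeholder fact-type labels ("trend" and "value"), and the bracketed glyph strings are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """One unit of text flowing through the pipeline (hypothetical structure)."""
    text: str
    fact_type: str = "unknown"
    spec: List[dict] = field(default_factory=list)
    rendering: str = ""


def discoverer(document: str) -> List[Segment]:
    # Stand-in for M1: split on sentence boundaries instead of asking an LLM.
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [Segment(text=p) for p in parts if p]


def annotator(seg: Segment) -> Segment:
    # Stand-in for M2: a keyword rule replaces LLM-based type labeling.
    is_trend = re.search(r"\b(rose|fell|increased|decreased)\b", seg.text)
    seg.fact_type = "trend" if is_trend else "value"
    return seg


def extractor(seg: Segment) -> Segment:
    # Stand-in for M3: collect the numbers mentioned in the segment.
    seg.spec = [{"value": float(m)} for m in re.findall(r"\d+(?:\.\d+)?", seg.text)]
    return seg


def visualizer(seg: Segment) -> Segment:
    # Stand-in for M4: pick a placeholder glyph per fact type.
    glyph = {"trend": "[line]", "value": "[number]"}.get(seg.fact_type, "[none]")
    seg.rendering = f"{seg.text} {glyph}"
    return seg


def run_pipeline(document: str) -> List[Segment]:
    # Data flows through the four modules sequentially.
    return [visualizer(extractor(annotator(s))) for s in discoverer(document)]


if __name__ == "__main__":
    for s in run_pipeline("Sales rose from 10 to 15. The store has 3 branches."):
        print(s.fact_type, s.spec, "|", s.rendering)
```

Keeping each module as a separate function mirrors the modular design described in the abstract: any stand-in could be swapped for an LLM-backed version without changing the others.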

Paper Structure

This paper contains 65 sections, 3 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Each element in the data specification is a four-tuple: space ❶, breakdown ❷, feature ❸, and value ❹.
  • Figure 2: The GistVis pipeline consists of four modules: Discoverer (M1), Annotator (M2), Extractor (M3), and Visualizer (M4). Data flows through the four modules sequentially, where a large language model captures the insight of the data-rich document (M1-M3). Visualizer (M4) maps the captured insight into interactive visualizations, populated in situ in the text document at word scale.
  • Figure 3: A collection of 14 candidate visualizations and the corresponding chart type for each data fact type. The Example column shows how each word-scale visualization appears in a data-rich document. The examples are shown with the mouse hovering over the word-scale visualization in focus.
  • Figure 4: Normalized confusion matrices for data fact type annotation results. The left matrix (A) shows the result of our two-step Annotator (Type Checker + Type Moderator), while the right matrix (B) shows the result of the ablated condition (Type Moderator only). The horizontal axis denotes the predicted type, while the vertical axis indicates the actual type. The numbers on the diagonal line of this matrix represent the precision of classification for each category.
  • Figure 5: The interface we employed for our user study. The data-rich document is rendered in the Document Panel on the left. In the middle is the Questions Panel, where participants answer multiple-choice or summary questions. On the right is the Time and Control Panel, where we record the finishing time for the passage when participants click the submit button.
  • ...and 3 more figures
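
The four-tuple named in the Figure 1 caption can be sketched as a small record type. This is an illustrative assumption only: the field names come from the caption, but the meanings suggested in the comments, the `DataSpecItem` class, and the `word_scale_bars` helper are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DataSpecItem:
    """One element of a data specification (field names from Figure 1)."""
    space: str       # assumed: the context or subset the fact is about
    breakdown: str   # assumed: the category this element belongs to
    feature: str     # assumed: the measured attribute
    value: float     # the numeric value stated in the text


def word_scale_bars(items: List[DataSpecItem], width: int = 8) -> Dict[str, str]:
    """Map each breakdown to a text bar scaled against the largest value."""
    if not items:
        return {}
    peak = max(abs(item.value) for item in items) or 1.0
    return {
        item.breakdown: "#" * round(width * abs(item.value) / peak)
        for item in items
    }


if __name__ == "__main__":
    spec = [
        DataSpecItem("2023 sales", "Store A", "revenue", 10.0),
        DataSpecItem("2023 sales", "Store B", "revenue", 5.0),
    ]
    print(word_scale_bars(spec))
```

A fixed, flat record like this keeps the hand-off between an extraction step and a rendering step simple, since the renderer only needs to read the four fields.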