Leveraging Foundation Models for Crafting Narrative Visualization: A Survey

Yi He, Ke Xu, Shixiong Cao, Yang Shi, Qing Chen, Nan Cao

TL;DR

This survey examines how foundation models can support the crafting of data narratives across four phases: Analysis, Narration, Visualization, and Interaction. It analyzes 66–77 papers (the count varies by section) to map how large language and multimodal models assist with data enrichment, insight extraction, narrative logic, description, chart generation, authoring, interpretation, and navigation. The authors propose a structured reference model and identify eight tasks (nine in places) to guide research and practice, and they discuss limitations such as hallucination, evaluation gaps, and limited chart types. The work contributes a taxonomy, critical insights, and an interactive browser that help researchers and practitioners develop narrative visualizations with foundation models, with the aim of raising quality, trust, and accessibility in data storytelling.

Abstract

Narrative visualization transforms data into engaging stories, making complex information accessible to a broad audience. Foundation models, with their advanced capabilities such as natural language processing, content generation, and multimodal integration, hold substantial potential for enriching narrative visualization. Recently, a collection of techniques has been introduced for crafting narrative visualizations with foundation models from different aspects. We build our survey upon 66 papers to study how foundation models can progressively engage in this process, and we propose a reference model that categorizes the reviewed literature into four essential phases: Analysis, Narration, Visualization, and Interaction. Furthermore, we identify eight specific tasks (e.g., Insight Extraction and Authoring) where foundation models are applied across these phases to facilitate the creation of visual narratives. Detailed descriptions, related literature, and reflections are presented for each task. To make the survey more impactful and informative for diverse readers, we discuss key research problems and present the strengths and weaknesses of each task, guiding readers in identifying and seizing opportunities while navigating challenges in this field.

Paper Structure (23 sections, 17 figures, 1 table)

This paper contains 23 sections, 17 figures, 1 table.

Figures (17)

  • Figure 1: A Reference Model for Creating Narrative Visualizations Based on Foundation Models
  • Figure 2: An overview of the papers related to the reference model and their associated code. Papers marked with (*) represent tools that can be used in creating narrative visualizations, even if the primary focus of the research is not specifically on narrative visualization. A clearer and more detailed collection of the papers can be found at http://lm4vis.idvxlab.com/.
  • Figure 3: Selected examples of feature embedding: (1) Using text embedding technology to capture the deep semantic information of text, supporting the construction of graph-based narrative visualizations [10.1145/3581641.3584076]. (2) ADVISor: using BERT to convert the natural language in table headers and questions into vectors representing their semantic meaning [liu2021advisor]. (3) Erato: schematic diagrams of the fact embedding model [sun2022erato]. (4) Chart2Vec: learning a universal embedding of visualizations with context-aware information [10485458].
  • Figure 4: Selected examples of editing: (1) Epigraphics: message-driven infographics authoring [10.1145/3613904.3642172]. (2) FinFlier: layering visual elements onto charts of financial narrative visualizations [10787087]. (3) ChartSpark: embedding semantic context into charts with a text-to-image generative model [xiao2023let]. (4) LIDA: generating stylized graphics based on visualizations for infographics [dibia2023lida].
  • Figure 5: Selected examples of Chart Understanding, which is divided into three tasks. Chart Summarization: (1) UniChart: a model that processes visual elements through a chart encoder and text decoder to generate natural language summaries [masry2023unichart]. Chart Question Answering: (2) STL-CQA: a structured Transformer model that localizes chart elements and performs cross-modal reasoning [singh-shekhar-2020-stl]; (3) OpenCQA: multiple foundation models fine-tuned to answer open-ended questions about charts [kantharaj-etal-2022-opencqa]. Chart Retrieval: (4) WYTIWYR: a user intent-aware framework with multimodal inputs for visualization retrieval [xiao2023wytiwyr].
  • ...and 12 more figures