Compendia: Automated Visual Storytelling Generation from Online Article Collection
Manusha Karunathilaka, Litian Lei, Yiming Gao, Yong Wang, Jiannan Li
TL;DR
Compendia tackles the challenge of generating coherent data stories from unstructured online article collections. It introduces a two-module framework (Data Fact Extraction and Organization, and Visual Storytelling) that uses LLMs for retrieval, extraction, clustering, and narrative construction, and presents the resulting story in an interactive scrollytelling interface. The system is evaluated through quantitative accuracy metrics (e.g., 97.2% fact-content and data-point accuracy) and two user studies, which demonstrate high usability and the ability to produce engaging, source-traceable narratives. The work advances automated storytelling from unstructured text and suggests future directions in fact-checking, temporal reasoning, and personalized storytelling to enhance trust and applicability in real-world search interfaces.
Abstract
In the digital age, readers value quantitative journalism that is clear, concise, analytical, and human-centred. To understand complex topics, they often piece together scattered facts from multiple articles. Visual storytelling can transform fragmented information into clear, engaging narratives, yet its use with unstructured online articles remains largely unexplored. To fill this gap, we present Compendia, an automated system that analyzes online articles in response to a user's query and generates a coherent data story tailored to the user's informational needs. Compendia addresses key challenges of storytelling from unstructured text through two modules that together cover four stages: Online Article Retrieval, which gathers relevant articles; Data Fact Extraction, which identifies, validates, and refines quantitative facts; Fact Organization, which clusters and merges related facts into coherent thematic groups; and Visual Storytelling, which transforms the organized facts into narratives with visualizations in an interactive scrollytelling interface. We evaluated Compendia through a quantitative analysis, which confirmed its accuracy in fact extraction and organization, and through two user studies with 16 participants, which demonstrated its usability, effectiveness, and ability to produce engaging visual stories for open-ended queries.
