What Do We Mean When We Talk About Data Storytelling?

Leni Yang, Zezhong Wang, Xingyu Lan

TL;DR

This study conducts the first systematic review of how data storytelling is defined in academia, compiling a corpus of 233 publications and extracting 96 explicit definitions. Through thematic and affinity-map analyses, it identifies four core themes (What/How/Why/Who) and five definitional perspectives (D1–D5), revealing substantial variability and unresolved tensions between data-centric and narrative-oriented views. The authors then compare data storytelling to narratology, focusing on Event, Character, and Plot, to illuminate where data storytelling aligns with or diverges from classical storytelling. The work concludes with implications for terminology, measurement, and future research directions, advocating more precise language and broader theoretical grounding to support cross-disciplinary communication and methodological reuse.

Abstract

We have witnessed rapid growth in data storytelling research. Scholars from multiple disciplines have contributed new theories and techniques surrounding data storytelling. However, this prolific development has also left the boundary of data storytelling fuzzy. We argue that understanding how "data storytelling" has been defined and interpreted by academia is crucial for facilitating communication between researchers, encouraging the consistent use of concepts and measures, assisting newcomers in approaching and positioning their research in this area, and enabling the effective application of relevant techniques and tools. Thus, it is necessary to systematically reflect on "what is data storytelling" and promote a more thorough understanding of this concept. Specifically, we investigated how existing research has conceptualized "data storytelling." As a result, we identified 96 publications that provide explicit definitions. By coding these definitions in depth, we identified five paradigms of defining data storytelling, as well as a broad spectrum of interpretations regarding the content, objectives, and techniques of data storytelling. Finally, we conclude with implications for future research, aiming to foster nuanced communication about "data storytelling," suggest research opportunities, and establish a more inclusive theoretical foundation for this research direction.

Paper Structure

This paper contains 30 sections and 3 figures.

Figures (3)

  • Figure 1: The distributions of the paper types, subject areas, and application domains of the publications in our corpus over time. Note that only 101 papers explicitly state the domain(s) in which they anticipate data stories will be applied.
  • Figure 2: The coding results of data storytelling definitions.
  • Figure 3: Examples of data stories that reveal various interpretations of narrative elements: (A) a data story generated by Calliope [shi2020calliope], (B) the story Iraq's Bloody Toll [iraq2011] with an abstract "data character" labeled by [dasu2023character], (C) the story 200 Countries, 200 Years, 4 Minutes with a narrator shown on camera, (D) a data story plot design that applies the Hero's Journey structure to present disease data [mittenentzwei2023disease], and (E) the story Brooke Leave Home, which uses an imaginary protagonist to tell open data about the care system in England [concannon2020brooke].