Visualization of missing data: a state-of-the-art survey

Sarah Alsufyani, Matthew Forshaw, Sara Johansson Fernstad

TL;DR

Missing data visualization is underexplored but critical for understanding data quality and guiding imputation. The authors conduct a systematic literature review, propose a comprehensive taxonomy across data types, interactivity, and tasks, and categorize existing work into techniques, applications/tools, and evaluations. Key contributions include a state-of-the-art synthesis, the MissVisG/MissVis visualizations, nabular data concepts, and a structured overview of tools and evaluations, plus guidance on future research. The findings underscore the need for broader data-type support, realistic evaluations, and tighter integration with imputation workflows to improve practical data analysis.

Abstract

Missing data, that is, data values that are not recorded for a variable, occur in almost all statistical analyses and can have many causes, such as a lack of collection or a lack of documentation. Researchers need to deal with this issue adequately to provide a valid analysis. Visualization of missing values plays an important role in supporting the investigation and understanding of missing data patterns. While some techniques and tools for visualizing missing values are available, selecting the right visualization to fulfil the user's requirements remains a challenge. This paper provides an overview and state-of-the-art report (STAR) of the research literature on missing values visualization. To the best of our knowledge, this is the first survey paper with a focus on missing data visualization. The goal of this paper is to encourage visualization researchers to increase their involvement with missing data visualization.

Paper Structure (15 sections, 25 figures, 4 tables)

Figures (25)

  • Figure 1: Two grey squares of equal luminance are placed on a background of varying luminance; the square on the lighter background appears darker and the square on the darker background appears lighter (figure from Twiddy1994).
  • Figure 2: The user interface of Amelia View (Honaker2011).
  • Figure 3: Missingness map in Amelia (Honaker2011).
  • Figure 4: Plot of the missing-value patterns in the college data (Valero-Mora2019).
  • Figure 5: Diabetes data set shown with a lasagna plot (Jimenez2022).
  • ...and 20 more figures