Visualization of missing data: a state-of-the-art survey
Sarah Alsufyani, Matthew Forshaw, Sara Johansson Fernstad
TL;DR
Missing data visualization is underexplored but critical for understanding data quality and guiding imputation. The authors conduct a systematic literature review, propose a comprehensive taxonomy across data types, interactivity, and tasks, and categorize existing work into techniques, applications/tools, and evaluations. Key contributions include a state-of-the-art synthesis, the MissVisG/MissVis visualizations, nabular data concepts, and a structured overview of tools and evaluations, plus guidance on future research. The findings underscore the need for broader data-type support, realistic evaluations, and tighter integration with imputation workflows to improve practical data analysis.
Abstract
Missing data, meaning data values that are not recorded for a variable, occur in almost all statistical analyses and may arise for many reasons, such as a lack of collection or a lack of documentation. Researchers need to deal with this issue adequately to provide a valid analysis. The visualization of missing values plays an important role in supporting the investigation and understanding of missing data patterns. While some techniques and tools for visualizing missing values are available, selecting a visualization that fulfils the user's requirements remains a challenge. This paper provides an overview and state-of-the-art report (STAR) of the research literature on missing values visualization. To the best of our knowledge, this is the first survey paper with a focus on missing data visualization. The goal of this paper is to encourage visualization researchers to increase their involvement with missing data visualization.
