Unveiling High-dimensional Backstage: A Survey for Reliable Visual Analytics with Dimensionality Reduction

Hyeon Jeon; Hyunwook Lee; Yun-Hsin Kuo; Taehyun Yang; Daniel Archambault; Sungahn Ko; Takanori Fujiwara; Kwan-Liu Ma; Jinwook Seo

TL;DR

This paper tackles unreliability in visual analytics that relies on dimensionality reduction (DR). It contributes a workflow model that maps analyst and machine roles across six stages, a taxonomy linking problems, aims, and solutions, and a meta-analysis of 133 papers that reveals patterns in the research landscape. The review shows that most work concentrates on creating new DR techniques rather than evaluating or interpreting them, and it documents practical reliability challenges, including overreliance on 2D scatterplots and a lack of libraries. The authors validate their findings with five expert researchers and offer actionable guidance, including an interactive browser and a reader-friendly guide for navigating the literature. Together, the contributions provide a structured, human-centered roadmap for researchers and practitioners seeking more reliable, interpretable, and usable DR-based visual analytics.

Abstract

Dimensionality reduction (DR) techniques are essential for visually analyzing high-dimensional data. However, visual analytics using DR is often unreliable, owing to factors such as the inherent distortions in DR projections. This unreliability can lead to analytic insights that misrepresent the underlying data, potentially resulting in misguided decisions. To tackle these reliability challenges, we review 133 papers that address the unreliability of visual analytics using DR. Through this review, we contribute (1) a workflow model that describes the interaction between analysts and machines in visual analytics using DR, and (2) a taxonomy that identifies where and why reliability issues arise within the workflow, along with existing solutions for addressing them. Our review reveals ongoing challenges in the field, whose significance and urgency are validated by five expert researchers. The review also finds that the current research landscape is skewed toward developing new DR techniques rather than interpreting or evaluating them, and we discuss how the HCI community can contribute to broadening this focus.

Paper Structure (91 sections, 4 figures, 4 tables)

Introduction
Related Work
Surveys on Dimensionality Reduction
Surveys on evaluation metrics
Our contribution
Investigation on the Practical Usage of Dimensionality Reduction
Our contribution
Theoretical Models for Visual Analytics Workflow
Our contribution
Protocol
Paper Selection
Survey scope
Procedure
Metadata analysis
Workflow Model and Taxonomy Design
...and 76 more sections

Figures (4)

Figure 1: Procedure of selecting papers (a-e) and their classification based on research fields (f). (a) We first search seed papers published in major visualization and human-computer interaction venues, and (b) filter out papers that do not fall within our survey scope. (c) We then extend our paper collection by screening the related works and backgrounds of the seed papers. (d, e) Finally, by extensively reviewing and filtering out unrelated papers, we finalize a total of 133 papers. (f) We classify the papers based on their fields. The icons above each arrow represent the authors involved in the corresponding step.
Figure 2: The distribution of collected papers over the years. The papers we identify come predominantly from the machine learning (ML) and visualization (VIS) fields. The number of published papers increases dramatically around the early 2000s.
Figure 3: The illustration of our workflow model. The model explains how an Analyst and a Machine interact while conducting visual analytics using DR. Each stage of visual analytics executed by analysts and machines is represented by red and blue rectangles, respectively, and the input and output of each stage are designated by arrows.
Figure 4: Example papers in each cluster we identify by conducting the meta-analysis. The problems that each system resolves are highlighted in bold. Reference papers: Pioneer [jeon22vis, joia11tvcg], Judge [aupetit14beliv], Instructor [sedlmair13tvcg], Explorer [nam13tvcg], Explainer [faust19tvcg, lespinats11cgf], and Architect [chatzimparmpas20tvcg].
