What Makes a Visualization Image Complex?

Mengdi Chu, Zefeng Qiu, Meng Ling, Shuning Jiang, Robert S. Laramee, Michael Sedlmair, Jian Chen

TL;DR

This paper addresses how perceived visual complexity (VC) in data visualizations can be quantified with objective image-based metrics. It introduces the large crowdsourced VisComplexity2K dataset and a metric-based attribution framework, including two novel object-based metrics, MeC and TiR, to relate image properties to VC. Through a PLS-based analysis, it shows that both low-level features (edge density, feature points) and high-level features (object count, text) matter, with a non-linear TiR effect and strong predictive power from feature congestion in certain stimulus families; network diagrams show particularly strong edge-based contributions. The work advances explainable VC modeling and provides a dataset and code to enable future metric-driven design and AI-assisted evaluation of visualization complexity.

Abstract

We investigate the perceived visual complexity (VC) in data visualizations using objective image-based metrics. We collected VC scores through a large-scale crowdsourcing experiment involving 349 participants and 1,800 visualization images. We then examined how these scores align with 12 image-based metrics spanning information-theoretic, clutter, color, and our two object-based metrics. Our results show that both low-level image properties and high-level elements affect perceived VC in visualization images. First, the number of corners and the number of distinct colors are robust metrics across visualizations. Second, feature congestion, an information-theoretic metric capturing statistical patterns in color and texture, is the strongest predictor of perceived complexity in visualizations rich in the same stimuli, while edge density effectively explains VC in node-link diagrams. Additionally, we observe a bell-curve effect for text annotations: increasing the text-to-ink ratio (TiR) initially reduces complexity until it reaches an optimal point, beyond which further text increases perceived complexity. Our quantification pipeline is also interpretable, enabling metric-based explanations grounded in the VisComplexity2K dataset and bridging computational metrics with human perceptual responses. The preregistration is available at osf.io/5xe8a; the VisComplexity2K dataset, source code, and all appendices and figures are available at osf.io/bdet6.
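To make the notion of an image-based metric concrete, the following is a minimal sketch of two low-level measures of the kind the abstract mentions: an edge density and a count of distinct colors. It is not the authors' implementation. The function names, the forward-difference gradient, and the `threshold` value are all assumptions chosen for illustration.

```python
def edge_density(gray, threshold=0.1):
    """Fraction of pixels whose local gradient magnitude exceeds `threshold`.

    `gray` is a 2D list of intensities in [0, 1]. The forward-difference
    gradient and the threshold are illustrative assumptions, not the
    edge detector used in the paper.
    """
    rows, cols = len(gray), len(gray[0])
    edge_pixels = 0
    for y in range(rows):
        for x in range(cols):
            # Forward differences, clamped at the image border.
            dx = gray[y][min(x + 1, cols - 1)] - gray[y][x]
            dy = gray[min(y + 1, rows - 1)][x] - gray[y][x]
            if (dx * dx + dy * dy) ** 0.5 > threshold:
                edge_pixels += 1
    return edge_pixels / (rows * cols)


def distinct_colors(rgb):
    """Number of unique colors in a 2D list of (r, g, b) tuples."""
    return len({tuple(pixel) for row in rgb for pixel in row})


# Usage: a 4x4 image with one vertical step edge and two colors.
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
two_tone = [[(255, 255, 255), (0, 0, 0)] for _ in range(4)]
print(edge_density(step), distinct_colors(two_tone))
```

Each function maps an image to a single score, which is the form a metric needs in order to be compared against crowdsourced VC ratings.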

Paper Structure

This paper contains 37 sections, 25 figures, 7 tables.

Figures (25)

  • Figure 1: Overview of our process to study perceived visual complexity. We use objective quality metrics to measure subjective, high-order perception. Activities are shown in gray and outcomes in green.
  • Figure 2: Example images ordered by their resulting visual complexity scores (least to most complex, from top left to bottom right).
  • Figure 3: An O.MeC calculation pipeline for an image with a continuous color map. Left: the original image with 41 measured H-S [heer2012color] nameable colors; Right: 5 colors grouped by H-S similarity to produce O.MeC = 5.
  • Figure 4: Objective metric scores in response to the visualization image input. Each row shows the original image followed by visual representations of the 12 objective metrics, with corresponding metric scores shown as raw (normalized) values below. The heatmaps highlight image regions where each metric responds most strongly—brighter and redder areas indicate higher values or more visually complex regions.
  • Figure 5: Factoring visual complexity. The relative contribution of each image metric is represented by the magnitude of its corresponding regression coefficient, modeled using PLS; metrics significant at the 0.05 level are highlighted in solid color.
  • ...and 20 more figures
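
The grouping step described in the Figure 3 caption, where many measured colors collapse into a few groups, can be sketched as follows. This is only an illustration under stated assumptions: the H-S similarity measure is not specified in this excerpt, so the sketch substitutes a simple hue and saturation distance in HSV space with a greedy assignment. The function names and the tolerances `hue_tol` and `sat_tol` are hypothetical and should not be read as the authors' O.MeC procedure.

```python
import colorsys


def hue_distance(h1, h2):
    """Circular distance between two hues in [0, 1)."""
    d = abs(h1 - h2)
    return min(d, 1.0 - d)


def count_color_groups(rgb_colors, hue_tol=0.05, sat_tol=0.5):
    """Greedily group 8-bit RGB colors and return the number of groups.

    Assumption: two colors belong together when their hue and saturation
    are within the given tolerances. This stands in for the similarity
    measure named in the caption, which is not reproduced here.
    """
    representatives = []  # one (hue, saturation) pair per group
    for r, g, b in rgb_colors:
        h, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        for rep_h, rep_s in representatives:
            if hue_distance(h, rep_h) <= hue_tol and abs(s - rep_s) <= sat_tol:
                break  # close enough to an existing group
        else:
            representatives.append((h, s))  # start a new group
    return len(representatives)


# Usage: three reds and two blues collapse into two groups.
palette = [(255, 0, 0), (250, 10, 10), (200, 0, 0), (0, 0, 255), (10, 10, 250)]
print(count_color_groups(palette))
```

A greedy pass keeps the sketch short, but its result can depend on the order of the input colors; a clustering method with a fixed similarity threshold would avoid that.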