What Makes a Visualization Image Complex?
Mengdi Chu, Zefeng Qiu, Meng Ling, Shuning Jiang, Robert S. Laramee, Michael Sedlmair, Jian Chen
TL;DR
This paper addresses how perceived visual complexity (VC) in data visualizations can be quantified with objective image-based metrics. It introduces VisComplexity2K, a large crowdsourced dataset, and a metric-based attribution framework that includes two novel object-based metrics, MeC and TiR, to relate image properties to VC. Through a PLS-based analysis, it shows that both low-level features (edge density, feature points) and high-level features (object count, text) matter. TiR has a non-linear effect, feature congestion has strong predictive power in certain stimulus families, and edge-based metrics contribute particularly strongly in network diagrams. The work advances explainable VC modeling and provides a dataset and code to enable future metric-driven design and AI-assisted evaluation of visualization complexity.
Abstract
We investigate the perceived visual complexity (VC) in data visualizations using objective image-based metrics. We collected VC scores through a large-scale crowdsourcing experiment involving 349 participants and 1,800 visualization images. We then examined how these scores align with 12 image-based metrics spanning information-theoretic, clutter, color, and our two object-based metrics. First, our results show that both low-level image properties and high-level elements affect perceived VC in visualization images; the number of corners and the number of distinct colors are robust metrics across visualizations. Second, feature congestion, an information-theoretic metric capturing statistical patterns in color and texture, is the strongest predictor of perceived complexity in visualizations rich in the same stimuli, while edge density effectively explains VC in node-link diagrams. Third, we observe a bell-curve effect for text annotations: increasing the text-to-ink ratio (TiR) initially reduces complexity, reaching an optimal point, beyond which further text increases perceived complexity. Our quantification pipeline is also interpretable, enabling metric-based explanations that are grounded in the VisComplexity2K dataset and bridge computational metrics with human perceptual responses. The preregistration is available at osf.io/5xe8a; the VisComplexity2K dataset, source code, appendices, and figures are available at osf.io/bdet6.
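To make two of the metrics named above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes an image is a nested list of RGB tuples, that "ink" means any non-background pixel, and that text regions are given as bounding boxes. The function names, the box format, and this particular reading of TiR (text ink divided by total ink) are illustrative assumptions; the paper's exact definitions may differ.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
# Hypothetical box format: (row0, col0, row1, col1), end-exclusive.
Box = Tuple[int, int, int, int]

WHITE: Pixel = (255, 255, 255)


def count_distinct_colors(image: Image, background: Pixel = WHITE) -> int:
    """Count the unique non-background colors in the image."""
    return len({px for row in image for px in row if px != background})


def text_to_ink_ratio(
    image: Image, text_boxes: List[Box], background: Pixel = WHITE
) -> float:
    """Assumed TiR: ink pixels inside text boxes divided by all ink pixels."""
    ink = 0
    text_ink = 0
    for r, row in enumerate(image):
        for c, px in enumerate(row):
            if px == background:
                continue  # background pixels are not ink
            ink += 1
            if any(r0 <= r < r1 and c0 <= c < c1
                   for r0, c0, r1, c1 in text_boxes):
                text_ink += 1
    # An image with no ink has no text ink either; report 0.0.
    return text_ink / ink if ink else 0.0
```

In practice the text boxes would come from an OCR or annotation step, and the color count would likely need quantization to be robust to anti-aliasing; both are left out of this sketch.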
