Visual Stenography: Feature Recreation and Preservation in Sketches of Noisy Line Charts
Rifat Ara Proma, Michael Correll, Ghulam Jilani Quadri, Paul Rosen
TL;DR
This study investigates how readers prioritize features in noisy line charts by using visual stenography, a sketch-based task with 9 datasets across 5 noise levels. The primary study reveals three behavior patterns (Replicator, Trend Keeper, and De-noiser) and shows that trends and peaks are often preserved faithfully, while periodicity and noise are represented semantically. Feature estimation relies on LOESS and FFT for trends, FFT for periodicity, Topological Data Analysis for peaks and valleys, and high-frequency FFT plus Pixel Approximate Entropy for noise, which supports a mixed-methods analysis. A follow-up validation confirms the clustering patterns and suggests practical implications: smoothing is not always necessary, storytelling through annotations can aid interpretation, and visual-query systems should accommodate flexible, human-centric representations of time-series data.
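As a rough illustration of the FFT-based estimates mentioned above, the sketch below computes a dominant period and a high-frequency energy share for a uniformly sampled series. The function names, the mean-only detrending, and the `cutoff` value are assumptions made for this example; they are not the study's actual pipeline or parameters.

```python
import numpy as np

def dominant_period(y):
    """Period (in samples) of the strongest non-DC frequency component."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()                        # remove the mean only (assumption: no trend removal)
    power = np.abs(np.fft.rfft(y)) ** 2     # one-sided power spectrum
    freqs = np.fft.rfftfreq(len(y), d=1.0)  # frequencies in cycles per sample
    k = np.argmax(power[1:]) + 1            # skip the DC bin at index 0
    return 1.0 / freqs[k]

def high_freq_energy_ratio(y, cutoff=0.25):
    """Share of non-DC spectral energy at or above `cutoff` cycles per sample.

    `cutoff` is a hypothetical threshold chosen for illustration.
    """
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    power = np.abs(np.fft.rfft(y)) ** 2
    freqs = np.fft.rfftfreq(len(y), d=1.0)
    total = power[1:].sum()
    if total == 0.0:                        # constant series: no energy to split
        return 0.0
    return power[freqs >= cutoff].sum() / total

# Usage: a sine with a 50-sample period versus white noise.
t = np.arange(400)
periodic = np.sin(2 * np.pi * t / 50)
noise = np.random.default_rng(0).normal(size=4000)
period = dominant_period(periodic)
ratio_periodic = high_freq_energy_ratio(periodic)
ratio_noise = high_freq_energy_ratio(noise)
```

A strong trend would leak into the low-frequency bins here, so a fuller treatment would subtract a smoothed trend (for example, a LOESS fit) before taking the spectrum.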
Abstract
Line charts surface many features in time series data, from trends to periodicity to peaks and valleys. However, not every potentially important feature in the data corresponds to a visual feature that readers can detect or prioritize. In this study, we conducted a visual stenography task, where participants re-drew line charts to solicit information about the visual features they believed to be important. We systematically varied noise levels (SNR of roughly 5 to 30 dB) across line charts to observe how visual clutter influences which features people prioritize in their sketches. We identified three key strategies that correlated with the noise present in the stimuli: the Replicator attempted to retain all major features of the line chart, including noise; the Trend Keeper prioritized trends while disregarding periodicity and peaks; and the De-noiser filtered out noise while preserving other features. Further, we found that participants tended to faithfully retain trends, peaks, and valleys when these features were present, while periodicity and noise were represented in more qualitative or gestural ways: semantically rather than accurately. These results suggest a need to consider more flexible and human-centric ways of presenting, summarizing, pre-processing, or clustering time series data.
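To make the noise manipulation concrete, here is a minimal sketch of adding white Gaussian noise to a clean series at a target signal-to-noise ratio in decibels. It assumes signal power is measured as the variance of the clean series and that the noise is Gaussian; the abstract does not state either choice, and the five evenly spaced levels in the usage lines are hypothetical.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise scaled to a target SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    # Assumption: signal power is the variance of the clean series.
    signal_power = np.mean((signal - signal.mean()) ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def measured_snr_db(clean, noisy):
    """Empirical SNR in dB of `noisy` relative to the known `clean` series."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noisy, dtype=float) - clean
    signal_power = np.mean((clean - clean.mean()) ** 2)
    return 10 * np.log10(signal_power / np.mean(noise ** 2))

# Usage: one clean series rendered at five hypothetical noise levels.
x = np.linspace(0, 4 * np.pi, 20000)
clean = np.sin(x) + 0.1 * x
levels_db = np.linspace(5, 30, 5)  # illustrative spacing, not the study's levels
rng = np.random.default_rng(0)
stimuli = {float(db): add_noise_at_snr(clean, db, rng) for db in levels_db}
```

Lower SNR values produce visually noisier charts; at 5 dB the noise power is about a third of the signal power, while at 30 dB it is one thousandth.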
