Visual Stenography: Feature Recreation and Preservation in Sketches of Noisy Line Charts

Rifat Ara Proma, Michael Correll, Ghulam Jilani Quadri, Paul Rosen

TL;DR

This study investigates how readers prioritize features in noisy line charts using visual stenography, a sketch-based task with 9 datasets across 5 noise levels. The primary study reveals three behavior patterns (Replicator, Trend Keeper, and De-noiser) and shows that trends and peaks are often preserved faithfully, while periodicity and noise are represented semantically. Features are estimated with LOESS and FFT for trends, FFT for periodicity, Topological Data Analysis for peaks and valleys, and high-frequency FFT plus Pixel Approximate Entropy for noise; these estimates support a mixed-methods analysis. A follow-up validation confirms the clustering patterns and suggests practical implications: smoothing is not always necessary, storytelling through annotations can aid interpretation, and visual-query systems should accommodate flexible, human-centric representations of time-series data.

Abstract

Line charts surface many features in time series data, from trends to periodicity to peaks and valleys. However, not every potentially important feature in the data may correspond to a visual feature which readers can detect or prioritize. In this study, we conducted a visual stenography task, where participants re-drew line charts to solicit information about the visual features they believed to be important. We systematically varied noise levels (SNR ~5-30 dB) across line charts to observe how visual clutter influences which features people prioritize in their sketches. We identified three key strategies that correlated with the noise present in the stimuli: the Replicator attempted to retain all major features of the line chart including noise; the Trend Keeper prioritized trends disregarding periodicity and peaks; and the De-noiser filtered out noise while preserving other features. Further, we found that participants tended to faithfully retain trends and peaks and valleys when these features were present, while periodicity and noise were represented in more qualitative or gestural ways: semantically rather than accurately. These results suggest a need to consider more flexible and human-centric ways of presenting, summarizing, pre-processing, or clustering time series data.
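To make the noise manipulation concrete, the following is a minimal sketch of how a series could be degraded to a target signal-to-noise ratio in decibels. It is an illustration only: the summary does not state how the noise was generated, so the white Gaussian noise model, the choice to measure signal power about the mean, and the function names `add_noise_at_snr` and `measured_snr_db` are all assumptions.

```python
import numpy as np


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise scaled to a target SNR (dB).

    Assumption: signal power is measured about the mean, so a constant
    offset in the data does not inflate the power estimate.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean((signal - signal.mean()) ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB of `noisy` relative to the known `clean` series."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noisy, dtype=float) - clean
    signal_power = np.mean((clean - clean.mean()) ** 2)
    return 10 * np.log10(signal_power / np.mean(noise ** 2))
```

Under these assumptions, lower `snr_db` values (for example 5) yield visibly cluttered lines, while higher values (for example 30) leave the underlying shape nearly untouched.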

Paper Structure

This paper contains 59 sections, 17 figures, 2 tables.

Figures (17)

  • Figure 1: The illustration on the left showcases the process of visual stenography, where participants were shown line charts and were tasked with re-drawing them on an iPad. The stimuli are in blue, and the participant sketches are in red. (a-c) We compared the sketches with the shown stimuli and identified three behavior pattern clusters: Replicator, Trend Keeper, and De-noiser. (d-f) Further, participants showed general robustness to noise in terms of preserving trends, periodicity, peaks, and valleys in their sketches across different datasets, and (g-i) they tended to represent the periodicity and noisiness of the stimuli semantically (i.e., conceptually) rather than replicating them faithfully (i.e., accurately) in their sketches.
  • Figure 2: Line charts for all nine datasets used in this study.
  • Figure 3: Line charts for all nine datasets with max noise (SNR $\approx$ 5 dB).
  • Figure 4: Participant sketches (red) are overlaid on the stimuli (blue) of the Chicago dataset at four different noise levels. Coders evaluated the sketches and assigned them to categories based on retained features.
  • Figure 5: (a) A participant sketch of the Apple dataset with max noise (SNR $\approx$ 5 dB), which is not a valid time series, as indicated by the orange enclosure. (b) After artifact removal, the data is a valid time series. (c) The data is normalized so that it can be compared with the stimulus data.
  • ...and 12 more figures
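
The preprocessing described in the Figure 5 caption (artifact removal followed by normalization) can be sketched as follows. This is a hedged illustration, not the authors' procedure: the caption does not say how artifacts were removed, so the rule used here (discard any point where the pen moved backward along the x-axis), the min-max normalization, the resampling step, and the function names `to_valid_time_series` and `normalize_and_resample` are assumptions.

```python
import numpy as np


def to_valid_time_series(x, y):
    """Drop points that double back in x so each x maps to exactly one y.

    Assumption: a point is an artifact if its x does not exceed every
    x value drawn before it.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    running_max = np.maximum.accumulate(x)
    keep = np.ones(len(x), dtype=bool)
    keep[1:] = x[1:] > running_max[:-1]
    return x[keep], y[keep]


def normalize_and_resample(x, y, n_samples=100):
    """Min-max scale both axes to [0, 1] and resample on a uniform grid.

    Expects strictly increasing x spanning a nonzero range, as produced
    by `to_valid_time_series`.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_norm = (x - x.min()) / (x.max() - x.min())
    y_range = y.max() - y.min()
    # A flat sketch has no vertical range; map it to zeros instead of dividing by zero.
    y_norm = (y - y.min()) / y_range if y_range > 0 else np.zeros_like(y)
    grid = np.linspace(0.0, 1.0, n_samples)
    return grid, np.interp(grid, x_norm, y_norm)
```

Resampling the sketch and the stimulus onto the same grid makes a point-by-point comparison possible, which is the reason for normalizing both to a shared range in this sketch.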