PixelatedScatter: Arbitrary-level Visual Abstraction for Large-scale Multiclass Scatterplots

Ziheng Guo, Tianxiang Wei, Zeyu Li, Lianghao Zhang, Sisi Li, Jiawan Zhang

TL;DR

This work tackles overdraw in large-scale multiclass scatterplots by introducing PixelatedScatter, a method that partitions the plot into iso-density regions, applies regional density equalization, and reconstructs data distributions with a pixel-based layout. It balances preservation of relative regional densities with explicit outlier emphasis, enabling faithful representation across arbitrary abstraction levels and HDR data. The approach is validated through quantitative metrics, a user study, and qualitative case studies, showing superior density preservation, robust outlier representation, and strong visual contrast compared to prior methods. The results indicate practical benefits for high-resolution displays and varied data distributions, with potential extensions to interactive and hybrid rendering workflows.

Abstract

Overdraw is inevitable in large-scale scatterplots. Current scatterplot abstraction methods lose features in medium-to-low density regions. We propose a visual abstraction method designed to provide better feature preservation across arbitrary abstraction levels for large-scale scatterplots, particularly in medium-to-low density regions. The method consists of three closely interconnected steps: first, we partition the scatterplot into iso-density regions and equalize visual density; then, we allocate pixels for different classes within each region; finally, we reconstruct the data distribution based on pixels. User studies, quantitative and qualitative evaluations demonstrate that, compared to previous methods, our approach better preserves features and exhibits a special advantage when handling ultra-high dynamic range data distributions.
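The "equalize visual density" step can be illustrated with ordinary histogram equalization. The following is a minimal sketch, not the authors' implementation: the function name `equalize_region_density`, the `n_bins` and `log_scale` parameters, the use of region area as the histogram weight, and the rounding rule for the number of colored pixels are all assumptions made for illustration.

```python
import numpy as np

def equalize_region_density(region_density, region_area, n_bins=256, log_scale=True):
    """Map per-region data densities to visual densities in (0, 1].

    region_density: data density of each iso-density region (hypothetical input).
    region_area:    area of each region in pixels, used as the histogram weight
                    so that every pixel of a region counts as one sample.
    Returns (visual_density, pixels_to_color), one entry per region.
    """
    density = np.asarray(region_density, dtype=float)
    area = np.asarray(region_area, dtype=float)

    # Assumption: compress the value range first so that high-dynamic-range
    # densities do not all collapse into the first histogram bin.
    values = np.log1p(density) if log_scale else density

    # Pixel-weighted histogram of densities and its normalized cumulative sum.
    hist, edges = np.histogram(values, bins=n_bins, weights=area)
    cdf = np.cumsum(hist) / hist.sum()

    # Look up the bin of each region and read the CDF there; the CDF value
    # serves as the visual density of the region.
    bin_index = np.clip(np.searchsorted(edges, values, side="right") - 1, 0, n_bins - 1)
    visual_density = cdf[bin_index]

    # Assumption: the number of pixels to color is visual density times area.
    pixels_to_color = np.rint(visual_density * area).astype(int)
    return visual_density, pixels_to_color
```

Because the mapping is a cumulative distribution, it is monotone: a denser region never receives a lower visual density than a sparser one, which is what keeps relative regional densities readable after equalization.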

Paper Structure

This paper contains 18 sections, 6 equations, 12 figures, 1 algorithm.

Figures (12)

  • Figure 1: Pipeline of our method: (a) Original scatterplot; (b) Clusters after iso-density region partitioning, where each color represents a distinct cluster; (c) Splitting each cluster into density-consistent pixels to generate a data density distribution histogram, then equalizing the histogram and using the cumulative distribution function as the mapping to visual density, and finally calculating the number of pixels that need coloring within each cluster; (d) Filtering outlier classes based on the class distribution within individual clusters; (e) Allocating the pixel number from (c) to obtain class pixel numbers within a cluster; (f) Constructing an initial pixel layout; (g) Dispersing pixels to produce a non-overlapping layout; (h) Final representation result.
  • Figure 2: Illustration of Iso-density Region Partition: (a) Gridding of the original data; (b) Clusters after initial clustering, with the kurtosis calculated for each cluster; (c) Re-gridding clusters with kurtosis larger than $\theta_{k}$ using a halved grid size; (d) The final cluster partition result after re-clustering, where the kurtosis of each cluster is less than $\theta_{k}$.
  • Figure 3: Regional Density Distribution Equalization: (a) the original scatterplot; (b) effect after density equalization; (c) density equalization mapping process, where the top section shows the original data distribution, the middle section displays the cumulative distribution function after equalization, and the bottom section presents the distribution of visual density after mapping.
  • Figure 4: Pipeline of Initial Layout Construction: (a) The initial state; (b) Identifying placeable (light blue) and unplaceable (light green) regions; (c, d, e) Iteratively placing the most urgent class after calculating urgent indexes (UI) to complete the initial layout.
  • Figure 5: Pipeline of KD-Tree Guided Pixel Dispersion: (a) The initial overlapping layout; (b) Splitting the region along its longer axis (the y-axis); (c, d) Sub-splits within the yellow area, which continue until each pixel is assigned to exactly one grid cell; (e) The non-overlapping pixel layout.
  • ...and 7 more figures
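
The KD-tree guided dispersion described in the Figure 5 caption can be sketched as a recursive split of a rectangular grid along its longer axis. This is a simplified illustration under stated assumptions rather than the paper's algorithm: the function name `disperse_pixels`, the rectangular region, and the rule that divides pixels between the two halves in proportion to their cell capacity are all hypothetical choices.

```python
def disperse_pixels(points, x0, y0, width, height):
    """Assign each (possibly overlapping) point to its own grid cell.

    points: list of (x, y) tuples; requires len(points) <= width * height.
    The grid covers cells x0..x0+width-1 by y0..y0+height-1.
    Returns a list of (cell_x, cell_y) in the same order as `points`.
    """
    if len(points) > width * height:
        raise ValueError("more pixels than grid cells")
    cells = [None] * len(points)

    def split(indices, x0, y0, w, h):
        if not indices:
            return
        if w == 1 and h == 1:
            # The capacity clamp below guarantees a single index here.
            cells[indices[0]] = (x0, y0)
            return
        # Split along the longer axis, as in the caption.
        axis = 0 if w >= h else 1
        indices = sorted(indices, key=lambda i: points[i][axis])
        if axis == 0:
            w_low = w // 2
            cap_low, cap_high = w_low * h, (w - w_low) * h
        else:
            h_low = h // 2
            cap_low, cap_high = w * h_low, w * (h - h_low)
        n = len(indices)
        # Assumption: share pixels in proportion to capacity, then clamp so
        # that neither half receives more pixels than it has cells.
        n_low = round(n * cap_low / (cap_low + cap_high))
        n_low = min(cap_low, max(n - cap_high, n_low))
        if axis == 0:
            split(indices[:n_low], x0, y0, w_low, h)
            split(indices[n_low:], x0 + w_low, y0, w - w_low, h)
        else:
            split(indices[:n_low], x0, y0, w, h_low)
            split(indices[n_low:], x0, y0 + h_low, w, h - h_low)

    split(list(range(len(points))), x0, y0, width, height)
    return cells
```

For example, `disperse_pixels([(1, 1)] * 4, 0, 0, 2, 2)` spreads four coincident pixels over the four cells of a 2 by 2 grid. Sorting along the split axis before dividing keeps pixels that started on the low side of the region on the low side after dispersion, so the layout stays close to the original distribution.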