SimpleSets: Capturing Categorical Point Patterns with Simple Shapes

Steven van den Broek, Wouter Meulemans, Bettina Speckmann

TL;DR

SimpleSets visualizes categorical point data by enclosing spatial patterns in simple shapes. Its building blocks are islands (convex clusters) and banks (bounded-bend polylines), and it runs as a two-phase pipeline: first partition the data into patterns, then render enclosing shapes with a careful stacking and overlap-resolution strategy. The core contributions are formal definitions of simple patterns, a greedy partitioning algorithm with regularity and overlap delays, and a drawing method based on Minkowski-dilated patterns, local stacking, and curve modification that produces aesthetically pleasing, low-distortion set visualizations. Evaluations on standard benchmarks show that SimpleSets often outperforms existing hull- and Voronoi-based methods on cognitive load and distortion measures, and the open-source code supports reproducibility and further exploration.

Abstract

Points of interest on a map, such as restaurants, hotels, or subway stations, give rise to categorical point data: data that have a fixed location and one or more categorical attributes. Consequently, recent years have seen various set visualization approaches that visually connect points of the same category to support users in understanding the spatial distribution of categories. Existing methods use complex and often highly irregular shapes to connect points of the same category, leading to high cognitive load for the user. In this paper we introduce SimpleSets, which uses simple shapes to enclose categorical point patterns, thereby providing a clean overview of the data distribution. SimpleSets is designed to visualize sets of points with a single categorical attribute; as a result, the point patterns enclosed by SimpleSets form a partition of the data. We give formal definitions of point patterns that correspond to simple shapes and describe an algorithm that partitions categorical points into few such patterns. Our second contribution is a rendering algorithm that transforms a given partition into a clean set of shapes, resulting in an aesthetically pleasing set visualization. Our algorithm pays particular attention to resolving intersections between nearby shapes in a consistent manner. We compare SimpleSets to state-of-the-art set visualizations using standard datasets from the literature.
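To make the notions of a convex pattern and a dilated pattern more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes, purely for illustration, that a candidate island is a set of same-category points whose convex hull contains no point of another category, and that dilation means taking the Minkowski sum of the hull with a disk of radius `r`. The function names (`convex_hull`, `is_island_candidate`, `dilated_area`) and the radius are hypothetical. The area of the dilated hull follows from the Steiner formula for convex sets: area + perimeter · r + π r².

```python
from math import pi, hypot


def cross(o, a, b):
    """Signed area of the parallelogram spanned by (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull, q):
    """True if q is inside or on a CCW convex polygon with >= 3 vertices."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], q) >= 0 for i in range(n))


def is_island_candidate(same, others):
    """Assumed test: no point of another category lies in the hull of `same`."""
    hull = convex_hull(same)
    if len(hull) < 3:
        return True  # degenerate hull; this sketch does not test it further
    return not any(inside_hull(hull, q) for q in others)


def dilated_area(hull, r):
    """Area of the Minkowski sum of a convex hull and a disk of radius r."""
    n = len(hull)
    edges = [(hull[i], hull[(i + 1) % n]) for i in range(n)]
    area = abs(sum(a[0] * b[1] - b[0] * a[1] for a, b in edges)) / 2
    perimeter = sum(hypot(b[0] - a[0], b[1] - a[1]) for a, b in edges)
    return area + perimeter * r + pi * r * r


hotels = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
hull = convex_hull(hotels)
print(is_island_candidate(hotels, [(3, 3)]))  # foreign point outside the hull
print(is_island_candidate(hotels, [(1, 1)]))  # foreign point inside the hull
print(dilated_area(hull, 1.0))
```

The sketch only tests a single candidate pattern. The partitioning algorithm described in the paper additionally decides which candidates to merge and in what order, which is not modelled here.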

Paper Structure (11 sections, 1 equation, 29 figures, 1 table, 1 algorithm)


Figures (29)

  • Figure 1: A comparison of visualizations on a common benchmark dataset originating from the paper that introduced Bubble Sets [BubbleSets]. The points show hotels (blue), subway entrances (red), and medical clinics (green) in lower Manhattan.
  • Figure 1: Modification of dilated patterns to expose points beneath. Figures (b)--(e) show a closeup of the component $C$ in (a) and (f).
  • Figure 2: Figure (c) uses arguably more complex shapes than (b), resulting in higher cognitive load. Figures (b) and (c), compared to (a), use fewer and larger shapes, resulting in better continuation but more obfuscation and distortion. Figure (d) distorts point position compared to (a); the expected point position is at the centroid of a shape.
  • Figure 2: We modify the top orange pattern to expose the bottom two points by cutting the disks in $X$ out separately, even though they are close, because their intersection with the boundary of the top pattern is a circular arc. (a) Overlap. (b) Cutting out the convex hull of the two disks in $X$. (c) Cutting out the disks in $X$ separately, our preferred solution.
  • Figure 3: SimpleSets pipeline. From left to right: data points; partition into patterns; dilated patterns; final enclosing shapes.
  • ...and 24 more figures