Legible Label Layout for Data Visualization, Algorithm and Integration into Vega-Lite

Chanwut Kittivorawong

TL;DR

This work tackles legible labeling in data visualizations by introducing occupancy bitmaps to accelerate overlap detection, enabling fast, greedy label placement across diverse chart types. It develops a bitmap-based labeling pipeline and demonstrates significant runtime improvements over Particle-Based Labeling while preserving label density, then embeds the algorithm into Vega via a label transform and into Vega-Lite via a label encoding channel. The contributions include a scalable data structure for overlap checks, the adaptation of the approach to scatter, line, area, and map charts (with stacked-area adaptations), and a cohesive integration path for both Vega and Vega-Lite. The practical impact is enabling interactive, label-rich visualizations in high-level grammars without reimplementing labeling logic, with future work aimed at more chart types and smoother interactivity during zooming and panning.

Abstract

Legible labels should not overlap with other labels and other marks in a chart. When a chart contains a large number of data points, manually positioning these labels for each data point in the chart is a tedious task. A labeling algorithm is necessary to automatically lay out the labels for a chart with a large number of data points. The state-of-the-art labeling algorithm detects overlaps using a set of points to approximate each mark's shape. This approach is inefficient for large marks or many marks as it requires too many points to detect overlaps. In response, we present a bitmap-based label placement algorithm, which leverages an occupancy bitmap to accelerate overlap detection. To create an occupancy bitmap, we rasterize marks onto a bitmap based on the area they occupy in the chart. With the bitmap, we can efficiently place labels without overlapping existing marks, regardless of the number and geometric complexity of the marks. This bitmap-based algorithm offers significant performance improvements over the state-of-the-art approach while placing a similar number of labels. We also integrate this algorithm into Vega-Lite as one of its encoding channels, label encoding. Label encoding allows users to encode fields in each data point with a text label to annotate the mark that represents the data point in a chart.
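
The pipeline the abstract describes (rasterize marks, test candidate positions against the bitmap, then mark the placed label as occupied) can be illustrated with a minimal sketch. This is not the paper's implementation: the class `OccupancyBitmap`, the helpers `candidates` and `place_labels`, the one-pixel `gap`, and the order in which the eight candidate positions are tried are all assumptions made for illustration, and the bitmap here is a plain grid of booleans rather than a packed bit array.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), inclusive pixel bounds


class OccupancyBitmap:
    """Hypothetical helper: one boolean per pixel, True means occupied."""

    def __init__(self, width: int, height: int) -> None:
        self.width, self.height = width, height
        self.cells = [[False] * width for _ in range(height)]

    def in_bounds(self, box: Box) -> bool:
        x0, y0, x1, y1 = box
        return 0 <= x0 <= x1 < self.width and 0 <= y0 <= y1 < self.height

    def is_free(self, box: Box) -> bool:
        """True if the box lies inside the chart and covers no occupied pixel."""
        x0, y0, x1, y1 = box
        return self.in_bounds(box) and not any(
            self.cells[y][x] for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)
        )

    def mark(self, box: Box) -> None:
        """Mark every pixel under the box as occupied."""
        x0, y0, x1, y1 = box
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                self.cells[y][x] = True


def candidates(px: int, py: int, w: int, h: int, gap: int = 1) -> List[Box]:
    """Eight w-by-h boxes around anchor (px, py); the try order is an assumption."""
    xs = (px - gap - w, px - w // 2, px + gap + 1)  # left, centered, right
    ys = (py - gap - h, py - h // 2, py + gap + 1)  # above, middle, below
    return [
        (x, y, x + w - 1, y + h - 1)
        for j, y in enumerate(ys)
        for i, x in enumerate(xs)
        if not (i == 1 and j == 1)  # skip the box centered on the point itself
    ]


def place_labels(
    bitmap: OccupancyBitmap, points: List[Tuple[int, int]], w: int, h: int
) -> List[Optional[Box]]:
    """Greedy pass: take the first free candidate per point, or None if none fits."""
    placed: List[Optional[Box]] = []
    for px, py in points:
        box = next((c for c in candidates(px, py, w, h) if bitmap.is_free(c)), None)
        if box is not None:
            bitmap.mark(box)  # later labels must avoid this one
        placed.append(box)
    return placed
```

In use, the marks are rasterized first (here each point occupies a single pixel), and labels are then placed one at a time. The cost of each overlap check depends on the label's pixel area, not on how many marks are in the chart or how complex their shapes are.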

Paper Structure (22 sections, 10 figures)

Figures (10)

  • Figure 1: (Left) We rasterize the connected scatter plot onto the bitmap to mark occupied pixels, shown in orange. (Middle) We use the 8-position model [imhof1975positioning] to generate candidate positions for label placements. The cyan positions are available, while the red ones are not. (Right) After placing the label "1975", the pixels under the label need to be marked as occupied.
  • Figure 2: The black indices indicate the x/y coordinates of pixels in the chart. The red indices indicate the indices of the underlying array of the bitmap. For demonstration, the bitmap is implemented on an array of 4-bit integers, each representing a bit-string of length 4. The blue circles mark occupied pixels. The yellow box is the area to look up or update.
  • Figure 3: (Left) Labeled connected scatter plot. (Right) A snapshot of the bitmap when labeling the connected scatter plot. Here, a greedy labeling algorithm already placed labels in the left half of the chart.
  • Figure 4: The labeling results from (A) our bitmap-based labeling and (B) Particle-Based Labeling by Luboschik et al. [luboschik:particle]. (C) shows the visual difference between (A) and (B). The original Particle-Based Labeling may place a label that overlaps with existing marks by half a pixel. For example, the text's bounding box, as indicated with the red cross in (D), overlaps with a nearby line. Our Improved Particle-Based Labeling algorithm addresses this issue.
  • Figure 5: The runtime and the number of labels placed by the bitmap-based algorithm, the original Particle-Based Labeling algorithm, and the Improved Particle-Based Labeling algorithm. The gray bands show the differences between conditions.
  • ...and 5 more figures
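
The packed layout that the Figure 2 caption describes (pixels stored as bits in an array of small integers, with a rectangular area looked up or updated at once) can be sketched as follows. This is an illustrative assumption, not the paper's code: the class name `PackedBitmap`, the row-major pixel order, and the least-significant-bit-first order within each word are all choices made here. The word width is a parameter so that the 4-bit demonstration from the caption can be reproduced.

```python
from typing import Iterator, List, Tuple


class PackedBitmap:
    """Hypothetical bit-packed occupancy bitmap; pixel (x, y) is bit y * width + x."""

    def __init__(self, width: int, height: int, word_bits: int = 32) -> None:
        self.width, self.height, self.word_bits = width, height, word_bits
        n_bits = width * height
        self.words: List[int] = [0] * ((n_bits + word_bits - 1) // word_bits)

    def _row_masks(self, y: int, x0: int, x1: int) -> Iterator[Tuple[int, int]]:
        """Yield (word index, bit mask) pairs covering pixels x0..x1 of row y."""
        wb = self.word_bits
        start, end = y * self.width + x0, y * self.width + x1
        for wi in range(start // wb, end // wb + 1):
            lo = max(start, wi * wb) - wi * wb          # first bit used in this word
            hi = min(end, wi * wb + wb - 1) - wi * wb   # last bit used in this word
            yield wi, ((1 << (hi - lo + 1)) - 1) << lo

    def get_range(self, x0: int, y0: int, x1: int, y1: int) -> bool:
        """True if any pixel in the inclusive rectangle is occupied."""
        return any(
            self.words[wi] & mask
            for y in range(y0, y1 + 1)
            for wi, mask in self._row_masks(y, x0, x1)
        )

    def mark_range(self, x0: int, y0: int, x1: int, y1: int) -> None:
        """Mark every pixel in the inclusive rectangle as occupied."""
        for y in range(y0, y1 + 1):
            for wi, mask in self._row_masks(y, x0, x1):
                self.words[wi] |= mask
```

Because a single mask tests or sets up to `word_bits` pixels in one operation, a lookup over a label's bounding box touches far fewer array elements than a per-pixel scan would.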