Interpretable contour level selection for heat maps for gridded data

Tarn Duong

TL;DR

Density contour levels offer a probabilistically meaningful way to visualize gridded data via heat maps, but standard density contour estimation requires point data. The authors develop a grid-based approximation that computes $p_j = \delta \hat f(\mathbf g_j)$ on a grid $G$ to obtain $\hat R_\tau$ and $\hat f_\tau$, and extend the method to non-density grid functions using $g^+$ and $g^-$. Through synthetic mixtures and real gridded datasets (e.g., Paris population density and Western Australia temperature anomalies), they show that density-contour visualizations provide superior interpretability and robustness compared to naive quantile, equal-length, and natural contours, with lower symmetric-difference errors. The approach is implemented in open-source R packages, enabling wider adoption for confidential or high-volume gridded data visualization and decision support.

Abstract

Gridded data formats, where the observed multivariate data are aggregated into grid cells, ensure confidentiality and reduce storage requirements, with the trade-off that access to the underlying point data is lost. Heat maps are a highly pertinent visualisation for gridded data, and heat maps with a small number of well-selected contour levels offer improved interpretability over continuous contour levels. There are many possible contour level choices. Amongst them, density contour levels are highly suitable in many cases. Current methods for computing density contour levels require access to the observed point data, so they are not applicable to gridded data. To remedy this, we introduce an approximation of density contour levels for gridded data. We then compare our proposed method to existing contour level selection methods, and conclude that our proposal provides improved interpretability for synthetic and experimental gridded data.
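
To make the grid-based approximation concrete, the following is a minimal Python sketch, not the authors' R implementation. It assumes that $\delta$ is the common area of the grid cells, that the gridded values $\hat f(\mathbf g_j)$ are non-negative, and that $\hat R_\tau$ denotes the smallest upper level set of the gridded density whose approximate probability mass is at least $\tau$; the source summary does not state this convention, so it may differ from the paper's. The function names `density_contour_level` and `contour_region` are hypothetical, and the renormalisation of the cell masses is an added safeguard rather than a step taken from the paper.

```python
def density_contour_level(f_hat, delta, tau):
    """Approximate the density contour level f_tau from gridded density values.

    f_hat : flat list of non-negative density values, one per grid cell
    delta : area (or volume) of a single grid cell
    tau   : target probability mass in (0, 1] (assumed convention)
    """
    if not 0.0 < tau <= 1.0:
        raise ValueError("tau must lie in (0, 1]")
    # p_j = delta * f_hat(g_j): approximate probability mass of cell j
    total = sum(delta * f for f in f_hat)
    if total <= 0.0:
        raise ValueError("gridded density must have positive total mass")
    cumulative = 0.0
    # Visit cells from highest to lowest density, accumulating their mass
    for f in sorted(f_hat, reverse=True):
        # Renormalise so the cell masses sum to one (assumption, see lead-in)
        cumulative += delta * f / total
        if cumulative >= tau - 1e-12:  # small tolerance for rounding error
            return f
    return min(f_hat)


def contour_region(f_hat, level):
    """Indices of the grid cells in the upper level set {j : f_hat_j >= level}."""
    return [j for j, f in enumerate(f_hat) if f >= level]


if __name__ == "__main__":
    # Toy flattened grid with five cells of area 0.5 (illustrative values only)
    toy_density = [0.1, 0.4, 1.0, 0.4, 0.1]
    for tau in (0.25, 0.75):
        level = density_contour_level(toy_density, delta=0.5, tau=tau)
        print(tau, level, contour_region(toy_density, level))
```

Sorting the cells by decreasing density and accumulating their masses is what gives each level a probabilistic reading: under the stated convention, the region above the level for $\tau = 0.75$ is the most concentrated set of cells holding roughly 75% of the mass, which is different from taking the 75% quantile of the grid values themselves.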

Paper Structure

This paper contains 6 sections, 2 equations, 8 figures, 4 tables, 2 algorithms.

Figures (8)

  • Figure 1: Heat map for geospatial gridded data. Population of the Paris city region on 1 km $\times$ 1 km grid, with continuous sequential 'Heat' colour scale.
  • Figure 2: Heat map for non-geospatial gridded data. Year-monthly surface temperature anomaly time series in Western Australia, with continuous diverging 'Red-Blue' colour scale.
  • Figure 3: Gaussian mixture density $4/11 N((-1,1), 1/8[1, 0; 0, 1]) + 3/11 N((0,0), 1/8[1, 9/10; 9/10, 1]) + 4/11 N((1,-1), 1/8[1, 0; 0, 1])$. (a) Heat map with continuous 'Heat' colour scale, red (high) to orange (mid) to yellow (low). (b) Heat map with decile contour levels.
  • Figure 4: Comparison of contour level selection methods for gridded density estimates from mixture densities, with discretised sequential 'Heat' colour scale, with $n=10\,0000$ sample. (First row) Target proxy contour. (Second row) Density contour. (Third row) Naive quantile. (Fourth row) Equal length. (Fifth row) Natural (Jenks).
  • Figure 5: Contour regions for population density for Paris capital region on 1 km $\times$ 1 km grid, with discretised sequential 'Heat' colour scale. (a) Density contour levels. (b) Naive quantile contour levels.
  • ...and 3 more figures