The Hammock Plot: Where Categorical and Numerical Data Relax Together

Matthias Schonlau, Tiancheng Yang

TL;DR

The paper addresses the visualization of datasets containing both categorical and numerical variables, which traditional plots struggle to display coherently. It presents the hammock plot and its Stata implementation, hammock. A hammock plot uses parallel coordinates: univariate bars on the axes encode univariate frequencies, and connectors between the axes encode bivariate frequencies. The paper discusses highlighting, missing-value treatment, and putting axes on the same scale, and it treats parallel univariate plots as an edge case of hammock plots. It contributes numerous illustrative examples and introduces a publicly available dataset on the 2020 Tour de France for testing and demonstration. The result is a practical, interpretable visualization technique for mixed data, together with a ready-to-use software tool.

Abstract

Effective methods for visualizing data involving multiple variables, including categorical ones, are limited. The hammock plot (Schonlau 2003) visualizes both categorical and numerical variables using parallel coordinates. We introduce the Stata implementation hammock. We give numerous examples that explore highlighting, missing values, putting axes on the same scale, and tracing an observation across variables. Further, we discuss parallel univariate plots as an edge case of hammock plots. We also present and make publicly available a new dataset on the 2020 Tour de France. A graphical abstract is shown below.

Paper Structure

This paper contains 2 sections and 1 figure.

Figures (1)

  • Figure 1: A scatter plot and the corresponding hammock plot of repair record (1-5 "stars") vs. car origin from the auto data. The hammock plot has parallel axes. Univariate bars (on the axes) are proportional to the frequency of the corresponding category. Connectors (between the axes) are proportional to the corresponding bivariate frequencies. Where a connector is missing (e.g. repair record "1" to "foreign"), the bivariate frequency is zero. With only two axes, the usefulness of the hammock plot may not yet be apparent.
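
The quantities that the caption describes can be made concrete with a small computation. The following is a minimal Python sketch, not the Stata hammock command itself: it counts the univariate frequencies that would size the bars and the bivariate frequencies that would size the connectors for a two-axis plot. The records and the helper name `hammock_frequencies` are hypothetical and chosen only for illustration; they are not the actual auto data.

```python
from collections import Counter

# Hypothetical toy records of (repair_record, origin).
# These values are illustrative only and are NOT taken from the auto data.
records = [
    (3, "domestic"), (3, "domestic"), (4, "domestic"),
    (4, "foreign"), (5, "foreign"), (1, "domestic"),
]


def hammock_frequencies(rows):
    """Count the frequencies a two-axis hammock plot would encode.

    Returns (left_bars, right_bars, connectors):
      left_bars  -- univariate frequency of each category on the first axis
      right_bars -- univariate frequency of each category on the second axis
      connectors -- bivariate frequency of each (first, second) pair
    """
    left_bars = Counter(first for first, _ in rows)
    right_bars = Counter(second for _, second in rows)
    connectors = Counter(rows)
    return left_bars, right_bars, connectors


left_bars, right_bars, connectors = hammock_frequencies(records)

# A pair that never occurs has count zero, so no connector would be drawn.
missing_pair = connectors[(1, "foreign")]
```

In this toy data the pair (1, "foreign") never occurs, so its count is zero, which corresponds to a missing connector in the plot. With more than two variables, the same pairwise counts would be computed for each pair of adjacent axes.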