Size Should not Matter: Scale-invariant Stress Metrics

Reyan Ahmed, Cesim Erten, Stephen Kobourov, Jonah Lotz, Jacob Miller, Hamlet Taraz

TL;DR

The paper tackles the problem that widely used stress metrics for graph drawings are often scale-sensitive, which can mislead comparisons across algorithms. It analyzes eight metrics, derives closed-form, scale-minimizing variants, and demonstrates that scale-invariant metrics, particularly Scale-normalized Stress (SNS), align with intuitive and ground-truth layout quality across two diverse graph sets. Through experiments on Rome-Lib and SuiteSparse datasets using Neato, SFDP, and Random layouts, the authors show that scale-sensitive metrics can produce incorrect orderings, while SNS reliably recovers the expected ranking and remains fast to compute. They advocate adopting scale-normalized stress as the standard metric for fair stress-based evaluation, provide open-source implementations, and discuss limitations and directions for future work including re-evaluating prior studies with scale-invariant metrics.

Abstract

The normalized stress metric measures how closely distances between vertices in a graph drawing match the graph-theoretic distances between those vertices. It is one of the most widely employed quality metrics for graph drawing, and is even the optimization goal of several popular graph layout algorithms. However, normalized stress can be misleading when used to compare the outputs of two or more algorithms, as it is sensitive to the size of the drawing compared to the graph-theoretic distances used. Uniformly scaling a layout will change the value of stress despite not meaningfully changing the drawing. In fact, the change in stress values can be so significant that a clearly better layout can appear to have a worse stress score than a random layout. In this paper, we study different variants for calculating stress used in the literature (raw stress, normalized stress, etc.) and show that many of them are affected by this problem, which threatens the validity of experiments that compare the quality of one algorithm to that of another. We then experimentally justify one of the stress calculation variants, scale-normalized stress, as one that fairly compares drawing outputs regardless of their size. We also describe an efficient computation for scale-normalized stress and provide an open source implementation.
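The scale sensitivity described above can be illustrated with a minimal sketch. It assumes the common definition of normalized stress as the sum over vertex pairs of ((‖x_i − x_j‖ − d_ij) / d_ij)², leaving out any constant normalization factor, and obtains the scale factor by setting the derivative with respect to the scale to zero. The function names (`pair_distances`, `normalized_stress`, `optimal_scale`, `scale_normalized_stress`) are hypothetical and are not taken from the authors' implementation. The graph is assumed to be connected and unweighted, and the drawing must not place all vertices at the same point.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop-count distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def pair_distances(adj, pos):
    """List of (graph distance, drawn distance) for every vertex pair."""
    nodes = sorted(adj)
    pairs = []
    for u in nodes:
        d = bfs_distances(adj, u)
        for v in nodes:
            if u < v:
                pairs.append((d[v], math.dist(pos[u], pos[v])))
    return pairs


def normalized_stress(pairs, alpha=1.0):
    """Stress of the drawing after uniformly scaling it by alpha."""
    return sum(((alpha * e - d) / d) ** 2 for d, e in pairs)


def optimal_scale(pairs):
    """Scale that minimizes normalized_stress (zero of the derivative)."""
    num = sum(e / d for d, e in pairs)
    den = sum((e / d) ** 2 for d, e in pairs)
    return num / den


def scale_normalized_stress(pairs):
    """Normalized stress evaluated at the stress-minimizing scale."""
    return normalized_stress(pairs, optimal_scale(pairs))


# A 4-cycle drawn as a unit square, and the same drawing 10 times larger.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
pos = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
pos_big = {v: (10 * x, 10 * y) for v, (x, y) in pos.items()}

pairs = pair_distances(adj, pos)
pairs_big = pair_distances(adj, pos_big)

# The first pair of values differs widely; the second pair agrees up to rounding.
print(normalized_stress(pairs), normalized_stress(pairs_big))
print(scale_normalized_stress(pairs), scale_normalized_stress(pairs_big))
```

Because the optimal scale of the enlarged drawing is exactly one tenth of the original's, both drawings end up evaluated at the same effective size, which is why the second pair of values matches.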

Paper Structure (15 sections, 18 equations, 8 figures, 7 tables)

Figures (8)

  • Figure 1: (Top) The dwt_419 graph drawn by Neato [ellson2001graphviz], SFDP [ellson2001graphviz], and Random (a random assignment of node positions). (Bottom) The plot shows how normalized stress changes as one scales (stretches or shrinks) each drawing. The metric is clearly sensitive to scale, and there are large regions of the plot where the random layout has the lowest value. By not accounting for scale, one is essentially selecting an arbitrary point on each curve to compare the algorithms, which allows any desired ordering to be obtained.
  • Figure 2: Changing the size of a drawing does not change any salient features, shown here at half the size (left), the default size (middle) and 10 times the size (right). The pen width was left unadjusted for illustration.
  • Figure 3: Plots of the scale-sensitive metrics as one varies the size of the drawing. Each curve is derived from the small grid graph shown in Figure 2. Each curve is very sensitive to changes in scale; just doubling the size of the drawing increases the value of stress by an order of magnitude.
  • Figure 4: Two drawings and the accompanying Shepard diagrams for graph grafo2617. The top, stress-based drawing by Neato places more points of the Shepard diagram closer to the diagonal than the Random drawing. This is reflected in the correlation coefficients: 0.9 for Neato and 0.1 for Random.
  • Figure 5: Illustration of the first step of computing non-metric stress. From a drawing (left), a Shepard diagram is computed (right) and a monotonic regression is performed on its components (the orange line). Configurations in which the average horizontal distance from the blue points to the orange line is small have lower non-metric stress.
  • ...and 3 more figures
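
The Shepard-diagram comparison described in the captions for Figures 4 and 5 can be sketched as follows. A Shepard diagram pairs each graph-theoretic distance with the corresponding distance in the drawing, and a correlation coefficient summarizes how closely the two agree. The captions do not say which coefficient is used, so this sketch assumes Pearson's correlation; the helper names and the two toy layouts of a four-vertex path are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def drawn_distances(pos):
    """Euclidean distances for pairs (i, j) with i < j, in a fixed order."""
    n = len(pos)
    return [math.dist(pos[i], pos[j]) for i in range(n) for j in range(i + 1, n)]


# Path graph 0-1-2-3; hop distances listed in the same pair order as above:
# (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
graph_d = [1, 2, 3, 1, 2, 1]

straight = [(0, 0), (1, 0), (2, 0), (3, 0)]        # faithful layout
folded = [(0, 0), (1, 0), (0, 0.5), (1, 0.5)]      # distorted layout
straight_big = [(10 * x, 10 * y) for x, y in straight]

r_straight = pearson(graph_d, drawn_distances(straight))
r_folded = pearson(graph_d, drawn_distances(folded))
r_straight_big = pearson(graph_d, drawn_distances(straight_big))

print(r_straight, r_folded, r_straight_big)
```

Under this assumption the score is unaffected by uniform scaling, since Pearson's correlation does not change when one variable is multiplied by a positive constant.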