The Census-Stub Graph Invariant Descriptor

Matt I. B. Oddo, Stephen Kobourov, Tamara Munzner

TL;DR

The paper addresses the hairball and layout-dependence issues of traditional graph visualization by introducing invariant graph descriptors computed via a BFS-based Census framework. It defines Census-Node, Census-Edge, and Census-Stub to capture node, edge, and stub information, with Census-Stub delivering dramatically higher discriminating power than prior descriptors while maintaining modest storage costs. A comprehensive Graph Atlas Collider evaluation demonstrates Census-Stub’s superior ability to distinguish non-isomorphic graphs, and the authors present new visual encodings, Hop-Census and Census-Census, along with an 81-graph benchmark to validate qualitative performance. The work enables robust, layout-invariant topology analysis and provides open-source tools and data for replication and broader adoption in graph-analysis workflows.

Abstract

An invariant descriptor captures meaningful structural features of networks, useful where traditional visualizations, such as node-link views, face challenges such as the hairball phenomenon (inscrutable overlap of points and lines). Designing invariant descriptors involves balancing abstraction and information retention, as richer data summaries demand more storage and computational resources. Building on prior work, chiefly the BMatrix -- a matrix descriptor visualized as the invariant network portrait heatmap -- we introduce BFS-Census, a new algorithm computing our Census data structures: Census-Node, Census-Edge, and Census-Stub. Our experiments show Census-Stub, which focuses on stubs (half-edges), has orders of magnitude greater discerning power (ability to tell non-isomorphic graphs apart) than any other descriptor in this study, without a difficult trade-off: the substantial increase in resolution does not come at a commensurate cost in storage space or computational power. We also present new visualizations -- our Hop-Census polylines and Census-Census trajectories -- and evaluate them using real-world graphs, including a sensitivity analysis that shows that changes in graph topology map to visible changes in the Census.
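
To make the idea of a BFS-derived invariant descriptor concrete, here is a minimal Python sketch. It is not the paper's BFS-Census algorithm, whose details are not given on this page. It assumes a simplified stand-in: for each node, count how many nodes sit at each hop distance, keep the sorted collection of those per-node vectors as a label-invariant descriptor, and aggregate them into a portrait-style matrix where entry (h, k) counts the nodes that see exactly k nodes at hop h. The names hop_counts, node_census, and portrait are hypothetical.

```python
from collections import Counter, deque


def hop_counts(adj, source):
    """BFS from `source`; return [n_0, n_1, ...], n_h = nodes exactly h hops away."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    counts = Counter(dist.values())
    return [counts[h] for h in range(max(counts) + 1)]


def node_census(adj):
    """Sorted multiset of per-node hop vectors; sorting removes label dependence."""
    return sorted(tuple(hop_counts(adj, s)) for s in adj)


def portrait(adj):
    """Aggregate: b[(h, k)] = number of nodes that have exactly k nodes at hop h."""
    b = Counter()
    for s in adj:
        for h, k in enumerate(hop_counts(adj, s)):
            b[(h, k)] += 1
    return b


# Two non-isomorphic 4-node graphs as adjacency lists (illustrative data).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
# The same path with different node labels: 2 - 0 - 3 - 1.
relabeled_path = {2: [0], 0: [2, 3], 3: [0, 1], 1: [3]}

print(node_census(path))
print(node_census(star))
print(node_census(path) == node_census(relabeled_path))
```

The per-node vectors keep node-specific information that the aggregated matrix discards, which is the distinction between Census and BMatrix that the abstract draws.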

Paper Structure

This paper contains 36 sections, 12 figures.

Figures (12)

  • Figure 1: Node-link views of all 38 connected labeled graphs with 4 nodes. From an unlabeled perspective there are only 6 connected graphs (grouped by isomorphism class).
  • Figure 2: Multiple idioms representing the largest component within the dimacs10-netscience network [NetworkScienceMain] from the konect.cc repository, a graph we call network-science in this paper. (A) The traditional node-link view, with force-directed placement. (B) Adjacency matrix, another traditional view, with nodes indexed randomly and then sorted by node degree. (C) Degree distribution plot, which is a histogram of the graph's node degree frequency. (D) Compact single-row heatmap of C, with frequency mapped to color. (E) The graph's network portrait [Bagrow_2008], where node degree information is encoded on the horizontal axis, each degree's frequency of occurrence is color-encoded as a heatmap, and the vertical axis generalizes this node-degree accounting to every hop distance, collected through BFS traversals. Note that D is equivalent to the 1-hop row in E.
  • Figure 3: BFS-Census algorithm pseudocode. The concurrent computation of three Census instantiations is differentiated by color: Census-Node in black, Census-Edge in orange, and Census-Stub in red. Both Census-Edge and Census-Stub depend on Census-Node, but not vice versa.
  • Figure 4: Comparison of Census and BMatrix data structures and visual encodings regarding node-specific information loss. (A) Node-link layout of the input graph, with (arbitrary) node labels color-coded. (B) Census data structure (Census-Node) computed by collecting node degrees through BFS-Census. (C) Census vectors (invariant descriptors) are visually encoded as node-colored lines that encode node-specific information. (D) Aggregation into a matrix structure, counting the frequencies of trajectories meeting at each grid cell, results in the graph’s BMatrix; node-based information is lost. (E) Matrix contents (frequencies) color-encoded as a heatmap; node-based information remains lost.
  • Figure 5: Discerning power and storage space results. (A) Collision counts (log10) proportional to the maximum number of collisions per Graph Atlas order. The lower a line is, the better. Results show Census-Stub (shown in red) has orders of magnitude more discerning power than the other descriptors, that the choice of constituent matters the most, and that Census beats BMatrix in all cases. (B) Absolute numbers (log10) rather than proportions. (C) Histograms of invariant descriptor storage space, frequency normalized, each computed from the entire Graph Atlas. Sorted vertically by maximum bytes reached: the lower and narrower the histogram, the better. Unlike the exponential differences in discerning power in A and B, the differences across bytes are linear. While BMatrix-Stub is the most expensive descriptor in terms of storage size, Census-Stub is not.
  • ...and 7 more figures
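
The counts in Figure 1 (38 labeled graphs, 6 isomorphism classes) match the connected graphs on 4 nodes, and they can be reproduced by brute force. The Python sketch below enumerates every edge subset, keeps the connected ones, and groups them by a canonical form found by trying all node permutations. It then checks how many classes a simple descriptor (the sorted degree sequence) separates, as a toy version of the collision idea behind Figure 5. The function names and the degree-sequence descriptor are illustrative assumptions, not the paper's Graph Atlas Collider, and the permutation search is only practical for very small graphs.

```python
from itertools import combinations, permutations


def is_connected(n, edges):
    """Depth-first search from node 0; connected if every node is reached."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in adj[u] - seen:
            seen.add(v)
            stack.append(v)
    return len(seen) == n


def connected_graphs(n):
    """Yield the edge tuple of every connected labeled graph on n nodes."""
    pairs = list(combinations(range(n), 2))
    for mask in range(1 << len(pairs)):
        edges = tuple(e for i, e in enumerate(pairs) if mask >> i & 1)
        if is_connected(n, edges):
            yield edges


def canonical(n, edges):
    """Smallest relabeled edge list over all n! permutations (brute force)."""
    return min(
        tuple(sorted(tuple(sorted((p[a], p[b]))) for a, b in edges))
        for p in permutations(range(n))
    )


def degree_sequence(n, edges):
    """A deliberately weak invariant descriptor: the sorted node degrees."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return tuple(sorted(deg))


graphs = list(connected_graphs(4))
classes = {canonical(4, g) for g in graphs}
descriptors = {degree_sequence(4, g) for g in classes}

print(len(graphs))       # connected labeled graphs on 4 nodes
print(len(classes))      # isomorphism classes among them
print(len(classes) - len(descriptors))  # classes the descriptor fails to separate
```

At this size even the degree sequence separates every class, so the last number is zero. Passing a larger n to the same functions is one way to see where a weak descriptor starts to collide.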