Cycles Communities from the Perspective of Dendrograms and Gradient Sampling

Sixtus Dakurah

TL;DR

This work tackles the challenge of identifying and matching cycle structures across topological objects by introducing two complementary frameworks. The first framework builds dendrogram representations of homology via merge-tree algorithms and compares them with a Wasserstein distance, enabling hierarchical clustering and statistical analysis of cycle communities. The second framework extends Stratified Gradient Sampling to learn multiple cycle-barycenter filter functions, producing non-overlapping cycle communities through topological registration. Together, these approaches offer both a descriptive and a constructive toolkit for analyzing cycle organization in complex networks, with broad potential applications in neuroscience and network science. The paper also outlines future directions for integrating these methods with additional topological tools to deepen insights into diffusion, spectral properties, and harmonic structures on graphs.

Abstract

Identifying and comparing topological features, particularly cycles, across different topological objects remains a fundamental challenge in persistent homology and topological data analysis. This work introduces a novel framework for constructing cycle communities through two complementary approaches. First, a dendrogram-based methodology leverages merge-tree algorithms to construct hierarchical representations of homology classes from persistence intervals. The Wasserstein distance on merge trees is introduced as a metric for comparing dendrograms, establishing connections to hierarchical clustering frameworks. Through simulation studies, the discriminative power of dendrogram representations for identifying cycle communities is demonstrated. Second, an extension of Stratified Gradient Sampling simultaneously learns multiple filter functions that yield cycle barycenter functions capable of faithfully reconstructing distinct sets of cycles. The set of cycles each filter function can reconstruct constitutes cycle communities that are non-overlapping and partition the space of all cycles. Together, these approaches transform the problem of cycle matching into both a hierarchical clustering and topological optimization framework, providing principled methods to identify similar topological structures both within and across groups of topological objects.

Paper Structure

This paper contains 22 sections, 1 theorem, 10 equations, 11 figures, 1 table, 1 algorithm.

Key Result

Proposition 1

Let $D_1$ and $D_2$ be two sets of barcodes defined according to (eqn:barcodes). The 2-Wasserstein distance between $D_1$ and $D_2$ admits the simplified form where $d^1_{(j)}$ and $d^2_{(j)}$ are the $j$-th ordered values of the death times in $D_1$ and $D_2$ respectively.
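
A minimal sketch of how such a distance could be computed, assuming the simplified form is the standard one-dimensional optimal-transport result in which the $j$-th smallest death time in $D_1$ is matched to the $j$-th smallest in $D_2$, i.e. $\big(\sum_j (d^1_{(j)} - d^2_{(j)})^2\big)^{1/2}$. This form, the requirement that both barcodes contain the same number of death times, and the function names are assumptions made for illustration; the text above does not confirm the exact normalization. The brute-force helper is only there to check the sorted matching on small inputs.

```python
import math
from itertools import permutations


def wasserstein2_sorted(deaths_1, deaths_2):
    """2-Wasserstein distance between two equal-size sets of death times,
    computed by matching order statistics (assumed simplified form)."""
    if len(deaths_1) != len(deaths_2):
        raise ValueError("this sketch assumes equally many death times")
    a = sorted(deaths_1)
    b = sorted(deaths_2)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def wasserstein2_bruteforce(deaths_1, deaths_2):
    """Same quantity by minimizing over every bijection (small inputs only)."""
    best = min(
        sum((x - y) ** 2 for x, y in zip(deaths_1, perm))
        for perm in permutations(deaths_2)
    )
    return math.sqrt(best)


# Hypothetical death times for two small networks.
d1 = [0.30, 0.25, 0.40]
d2 = [0.10, 0.50, 0.20]
distance = wasserstein2_sorted(d1, d2)
```

Sorting replaces the search over all matchings, so the cost is dominated by the sort rather than by the factorial number of bijections the brute-force version enumerates.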

Figures (11)

  • Figure 1: An illustration of the filtration on a 1-dimensional simplicial complex. A dashed red or blue line indicates an edge that has been deleted. Top-left: the full simplicial complex, which is sequentially thresholded down to the point set (top-right). Bottom-left: the non-increasing count of 1-cycles (loops). Bottom-right: the non-decreasing count of connected components.
  • Figure 2: Illustration of the correspondence between the filtration and the dendrogram. Top-left: the five-node network. The first two filtration values, $d_1 = 0.25$ and $d_2 = 0.3$, destroy the two independent cycles and are not part of the birth set. At the third filtration value, denoted $b_1 = 0.35$, two connected components are created when edges with weight at most $b_1$ are deleted. The two blue lines in the dendrogram identify these two connected components (clusters). The process continues sequentially up to the highest filtration value, $b_4 = 0.55$, which yields five connected components (clusters): the node set of the network.
  • Figure 3: The network and its decomposition. (a) The fully-connected network. (b) The maximum spanning tree. (c) The non-maximum spanning tree.
  • Figure 4: The dendrograms for the cycles and connected components for the network in Figure (fig:full). (a) The dendrogram for cycles. The x-axis represents the filtration values (death values). (b) The dendrogram for connected components. The x-axis represents the filtration values (birth values).
  • Figure 5: A fully connected five-node network with six $1$-cycles. The $1$-cycle space is identified by the red-colored edges.
  • ...and 6 more figures
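
The filtration described in the captions of Figures 1 to 3 can be sketched in a few lines. This is an illustrative reading, not the paper's implementation: it assumes a connected weighted graph, takes the birth set to be the edge weights of the maximum spanning tree and the death set to be the remaining edge weights, and deletes edges with weight at most a given threshold. The edge list below is hypothetical; it reuses the values $d_1 = 0.25$, $d_2 = 0.3$, $b_1 = 0.35$ and $b_4 = 0.55$ quoted in the Figure 2 caption, while the weights 0.45 and 0.5 and the wiring of the five nodes are made up for the example.

```python
def birth_death_decomposition(num_nodes, weighted_edges):
    """Split edge weights into a birth set (maximum spanning tree edges)
    and a death set (edges that close a cycle), via Kruskal's algorithm
    run on edges sorted by decreasing weight."""
    parent = list(range(num_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    births, deaths = [], []
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2], reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # edge joins two components: tree edge
            births.append(w)
        else:
            deaths.append(w)         # edge closes a cycle: non-tree edge
    return sorted(births), sorted(deaths)


def betti_numbers(births, deaths, threshold):
    """Counts of connected components and independent 1-cycles after
    deleting every edge with weight <= threshold (connected graph assumed)."""
    beta0 = 1 + sum(1 for b in births if b <= threshold)
    beta1 = sum(1 for d in deaths if d > threshold)
    return beta0, beta1


# Hypothetical five-node network with six edges (two independent cycles).
edges = [
    (0, 1, 0.55), (1, 2, 0.50), (2, 3, 0.45),
    (3, 4, 0.35), (0, 2, 0.30), (2, 4, 0.25),
]
births, deaths = birth_death_decomposition(5, edges)
```

With these weights, raising the threshold first removes the two cycle-closing edges and then splits the network one component at a time, which is the monotone behavior the Figure 1 caption describes for the two counts.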

Theorems & Definitions (1)

  • Proposition 1