Distance-Based Hierarchical Cutting of Complex Networks with Non-Preferential and Preferential Choice of Seeds

Alexandre Benatti, Luciano da F. Costa

TL;DR

The paper investigates how distance-based hierarchical cutting partitions graphs into seed-centered components and builds a dendrogram describing the resulting hierarchy. It compares three network models, Erdős-Rényi (ER), Barabási-Albert (BA), and geometric (GEO), and studies two seed-selection strategies: non-preferential and degree-preferential. Geometric networks consistently yield the most balanced, sizeable components with little chaining, BA networks show strong chaining and imbalance, and ER networks are intermediate. Compared with random-walk-based cutting, distance-based methods produce significantly less chaining, and the results suggest tailored heuristics may be needed for scale-free topologies; extensions to more seeds and modular structures are proposed.

Abstract

Graphs and complex networks can be successively separated into connected components, each associated with a seed node, thereby establishing a hierarchical organization. In the present work, we study the properties of the hierarchical structure implied by distance-based cutting of Erdős-Rényi, Barabási-Albert, and a specific geometric network. Two main situations are considered regarding the choice of the seeds: non-preferential and preferential to the node degree. Among the findings, geometric networks tended to yield more balanced pairs of connected components along the progressive separation of the network, with little chaining, followed by the Erdős-Rényi and Barabási-Albert networks. Choosing seeds preferentially to the node degree tended to enhance the balance of the connected components in the case of the geometric networks.

Paper Structure (7 sections, 9 figures, 1 table)

Figures (9)

  • Figure 1: An original graph (a) with three seed nodes identified as square nodes in green, blue, and orange. The same graph partitioned (b) among the three seed nodes according to the shortest topological distances. Nodes that have the same distance to two or more seeds are assigned randomly to one of those seeds, in a manner that can be understood as analogous to the concept of Dirichlet tessellation (or Voronoi diagrams, e.g. riedinger1988delaunay). Once partitioned, the graph can be separated (c) into the connected components of the respective seeds by removing the edges extending between nodes belonging to different partitions. This basic separation approach can then be repeated, yielding a respective hierarchy.
  • Figure 2: Illustration of the recurrent application of the distance-based cutting of networks with respect to $M=3$ seeds, which are shown as square nodes.
  • Figure 3: Dendrogram obtained from the distance-based cutting illustrated in Fig. \ref{fig:branch_dendrogram}.
  • Figure 4: The region of possibly observable coordinates $\left(n_1, n_2\right)$, assuming the components to have at least size $N/2$, is defined by four vertices $A, B, C,$ and $E$. The most balanced situation, characterized by the two resulting components having the same size, corresponds to the point $C$. Henceforth, the region within the bounding polygon is separated into two sub-regions delimited by the polygons $ABDE$ and $CDE$, with the latter being associated with more balanced pairs of connected components which are also relatively large. The total densities of observations falling within these two regions are henceforth expressed as $P_2$ and $P_3$. A portion of the region $ABDE$, namely that comprised within $AFGB$, is also considered to correspond to chained pairs of components, leading to a respective probability $P_1$.
  • Figure 5: Scatterplots of component sizes $\left(n_1, n_2 \right)$ obtained for 2000 networks of ER (a), BA (b), and GEO (c) types considering the uniform choice of the seed nodes. The points on the line segment $AC$ correspond to the very first separation of the original network ($n_1+n_2=N$). The green cross-hair indicates the average $\pm$ standard deviation of the values of $n_1$ and $n_2$. The networks of GEO type resulted in the most balanced components, indicated by the concentration of cases near point E. At the same time, the BA networks tended to yield the least balanced connected components, with the observed coordinates covering most of the bounding polygon.
  • ...and 4 more figures
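
The single separation step described in the caption of Figure 1 can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `bfs_distances` and `distance_based_cut`, the adjacency-dictionary representation, and the `rng` parameter are assumptions introduced here. Each node is assigned to its closest seed in terms of hop distance, ties are broken at random, and edges joining nodes of different partitions are removed.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Hop (topological) distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_based_cut(adj, seeds, rng=None):
    """One cutting step (hypothetical helper, not from the paper's code).

    adj   : dict mapping each node to the set of its neighbours (undirected)
    seeds : list of distinct seed nodes
    Returns (owner, cut_adj): the seed assigned to each node, and the
    adjacency left after removing edges between different partitions.
    """
    rng = rng or random.Random()
    dists = {s: bfs_distances(adj, s) for s in seeds}

    owner = {}
    for node in adj:
        reachable = [(dists[s][node], s) for s in seeds if node in dists[s]]
        if not reachable:
            continue  # node unreachable from every seed: left unassigned
        best = min(d for d, _ in reachable)
        # Ties between equidistant seeds are broken uniformly at random.
        owner[node] = rng.choice([s for d, s in reachable if d == best])

    # Keep only the edges whose two endpoints share the same seed.
    cut_adj = {
        u: {v for v in nbrs if u in owner and owner.get(v) == owner[u]}
        for u, nbrs in adj.items()
    }
    return owner, cut_adj


# Usage on a 5-node path with seeds at both ends; node 2 is a tie.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
owner, cut_adj = distance_based_cut(path, [0, 4], random.Random(0))
```

Because ties are resolved independently for each node, a partition produced by this sketch is not guaranteed to be a single connected component; a fuller version would extract the connected component containing each seed before applying the step again to build the hierarchy.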