Detecting hyperbolic geometry in networks: why triangles are not enough

Riccardo Michielan, Nelly Litvak, Clara Stegehuis

TL;DR

The paper addresses the problem of detecting latent hyperbolic geometry in a network from its observed connections alone. It shows that plain triangle counts and the average clustering coefficient can fail to reveal geometry in heavy-tailed geometric inhomogeneous random graphs (GIRGs), and it introduces a weighted-triangles statistic $W$ that amplifies geometry-driven triangles. The authors prove that $\mathbb{E}[W]=O(1)$ in non-geometric inhomogeneous random graphs (IRGs), whereas $\mathbb{E}[W]=\Omega(n)$ and $\mathrm{Var}[W]=O(n)$ in GIRGs, where $n$ is the number of vertices. Because the standard deviation of $W$ is then $O(\sqrt{n})$, which is of smaller order than its mean, $W$ concentrates and grows linearly with $n$ in geometric networks, while it remains bounded otherwise. They validate the theory on synthetic models and real-world networks (ArXiv, CAIDA, Gnutella, Bitcoin), showing that $W$ tracks hidden geometry where standard metrics do not. The findings offer a practical, scalable diagnostic for latent geometry in networks and highlight the limitations of unweighted triangle-based statistics.
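To make the three statistics concrete, here is a minimal Python sketch that computes the triangle count, the average clustering coefficient, and a weighted-triangles value from an edge list. The summary above does not state the formula for $W$; the sketch assumes that each triangle $\{i,j,k\}$ contributes $1/(d_i d_j d_k)$, where $d_v$ is the degree of vertex $v$, so that triangles among low-degree vertices count more. The function names `build_adjacency` and `triangle_statistics` are hypothetical and chosen only for illustration.

```python
def build_adjacency(edges):
    """Build an undirected simple-graph adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangle_statistics(edges):
    """Return (triangle count, average clustering, weighted triangles W).

    ASSUMPTION: W sums 1 / (d_i * d_j * d_k) over all triangles {i, j, k};
    this weighting is not spelled out in the summary text.
    """
    adj = build_adjacency(edges)
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    # Rank vertices so that every triangle is enumerated exactly once.
    rank = {v: i for i, v in enumerate(adj)}
    tri_at = dict.fromkeys(adj, 0)  # number of triangles through each vertex
    triangles = 0
    weighted = 0.0
    for u in adj:
        for v in adj[u]:
            if rank[v] <= rank[u]:
                continue
            for w in adj[u] & adj[v]:  # common neighbours close a triangle
                if rank[w] <= rank[v]:
                    continue
                triangles += 1
                weighted += 1.0 / (deg[u] * deg[v] * deg[w])
                for x in (u, v, w):
                    tri_at[x] += 1
    # Local clustering: fraction of neighbour pairs that are connected.
    local = [
        2.0 * tri_at[v] / (deg[v] * (deg[v] - 1)) if deg[v] >= 2 else 0.0
        for v in adj
    ]
    avg_clustering = sum(local) / len(local) if local else 0.0
    return triangles, avg_clustering, weighted


if __name__ == "__main__":
    # A triangle on vertices 0, 1, 2 with a pendant vertex 3 attached to 2.
    example_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
    print(triangle_statistics(example_edges))
```

Under this assumed weighting, a triangle that involves a hub contributes very little to the sum, which is one way to read the claim that $W$ emphasizes triangles that are hard to explain by degrees alone.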

Abstract

In the past decade, geometric network models have received vast attention in the literature. These models formalize the natural idea that similar vertices are likely to connect. Because of that, these models are able to adequately capture many common structural properties of real-world networks, such as self-invariance and high clustering. Indeed, many real-world networks can be accurately modeled by positioning vertices of a network graph in hyperbolic spaces. Nevertheless, if one observes only the network connections, the presence of geometry is not always evident. Currently, triangle counts and clustering coefficients are the standard statistics to signal the presence of geometry. In this paper we show that triangle counts or clustering coefficients are insufficient because they fail to detect geometry induced by hyperbolic spaces. We therefore introduce a novel triangle-based statistic, which weighs triangles based on their strength of evidence for geometry. We show analytically, as well as on synthetic and real-world data, that this is a powerful statistic to detect hyperbolic geometry in networks.

Paper Structure (13 sections, 38 equations, 2 figures, 1 table)

This paper contains 13 sections, 38 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Triangle counts, clustering coefficient and weighted triangles for the inhomogeneous random graph (IRG) and the geometric inhomogeneous random graph (GIRG). Dots are averages over 100 samples of IRG and GIRG, plotted against the number of vertices $n$. Grey lines are linear interpolations between dots. (a) When $\tau>7/3$, the number of triangles increases linearly in GIRG, but slower than linearly in IRG. (b) When $\tau<7/3$, the triangle counts in the non-geometric and geometric models share the same scaling. (c) When $\tau$ is large, $\overline{C}$ differs significantly between the non-geometric and geometric models. (d) When $\tau$ is small, $\overline{C}$ is qualitatively similar in both models. (e), (f) $W$ differs significantly between the two models for both small and large values of $\tau$. In particular, in non-geometric models $W$ remains bounded below 1/6.
  • Figure 2: Triangle counts $\triangle$ (blue), average clustering coefficient $\overline{C}$ (orange), and weighted triangles $W$ (green), computed for the data sets: (a) ArXiv collaboration (cond-mat); (b) CAIDA autonomous systems relationship; (c) Gnutella peer-to-peer connections; (d) Bitcoin transactions. The red line is the simple linear regression of $W$.
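
The bound of 1/6 mentioned in the Figure 1 caption can be motivated by a short heuristic calculation. The following is a sketch under assumptions that are not stated in the text above, not the paper's proof: it assumes that $W$ sums $1/(d_i d_j d_k)$ over all triangles $\{i,j,k\}$, that the IRG connects vertices $i$ and $j$ independently with probability $p_{ij}=\min\{w_i w_j/(\mu n),\,1\}$ for vertex weights $w_i$ with $\mu n=\sum_i w_i$, and that each degree $d_i$ can be replaced by its weight $w_i$.

$$
\mathbb{E}[W] \;\approx\; \sum_{i<j<k} \frac{p_{ij}\,p_{jk}\,p_{ik}}{w_i w_j w_k}
\;\le\; \sum_{i<j<k} \frac{w_i w_j w_k}{(\mu n)^3}
\;\le\; \frac{1}{6}\left(\frac{\sum_i w_i}{\mu n}\right)^{3}
\;=\; \frac{1}{6}.
$$

The first inequality uses $p_{ij}\le w_i w_j/(\mu n)$ for each of the three edges, and the second uses that every unordered triple appears six times in the expansion of $\big(\sum_i w_i\big)^3$. Under these assumptions the degree weighting cancels the heavy tail of the weights, which is consistent with $W$ staying bounded in the non-geometric model for every value of $\tau$.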