Faster algorithms for the alignment of sparse correlated Erdös-Rényi random graphs

Andrea Muratori, Guilhem Semerjian

TL;DR

A family of faster algorithms for the graph alignment problem is presented. Numerical simulations show that their accuracy is only slightly reduced with respect to the original algorithm, and the authors conjecture that, in the large $\lambda$ limit, they undergo phase transitions at modified Otter's thresholds $\sqrt{\widehat{\alpha}}>\sqrt{\alpha}$.

Abstract

The correlated Erdös-Rényi random graph ensemble is a probability law on pairs of graphs with $n$ vertices, parametrized by their average degree $\lambda$ and their correlation coefficient $s$. It can be used as a benchmark for the graph alignment problem, in which the labels of the vertices of one of the graphs are reshuffled by an unknown permutation; the goal is to infer this permutation and thus properly match the pairs of vertices in both graphs. A series of recent works has unveiled the role of Otter's constant $\alpha$ (that controls the exponential rate of growth of the number of unlabeled rooted trees as a function of their sizes) in this problem: for $s>\sqrt{\alpha}$ and $\lambda$ large enough it is possible to recover, in a time polynomial in $n$, a positive fraction of the hidden permutation. The exponent of this polynomial growth is however quite large and depends on the other parameters, which limits the range of applications of the algorithm. In this work we present a family of faster algorithms for this task, show through numerical simulations that their accuracy is only slightly reduced with respect to the original one, and conjecture that they undergo, in the large $\lambda$ limit, phase transitions at modified Otter's thresholds $\sqrt{\widehat{\alpha}}>\sqrt{\alpha}$, with $\widehat{\alpha}$ related to the enumeration of a restricted family of trees.
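The threshold $\sqrt{\alpha}$ can be checked numerically. The sketch below is not taken from the paper: it counts unlabeled rooted trees with the classical recurrence and estimates $\alpha$ from the ratio of consecutive counts. It assumes the convention in which the number of trees of size $n$ grows like $\alpha^{-n} n^{-3/2}$, so that $\alpha \approx 0.338$ and $\sqrt{\alpha} \approx 0.58$. The function names are illustrative.

```python
def rooted_tree_counts(n_max):
    """Return r with r[k] = number of unlabeled rooted trees on k vertices, k <= n_max.

    Uses the classical recurrence
        r[n+1] = (1/n) * sum_{k=1..n} s[k] * r[n-k+1],
        s[k]   = sum over divisors d of k of d * r[d].
    """
    r = [0, 1]  # r[0] is a placeholder, r[1] = 1 (the single-vertex tree)
    s = [0]     # s[0] is a placeholder
    for n in range(1, n_max):
        s.append(sum(d * r[d] for d in range(1, n + 1) if n % d == 0))
        total = sum(s[k] * r[n - k + 1] for k in range(1, n + 1))
        r.append(total // n)  # the division is exact
    return r


def estimate_otter_alpha(n_max=200):
    """Estimate alpha from r[n] / r[n+1], compensating the n^(-3/2) prefactor."""
    r = rooted_tree_counts(n_max)
    n = n_max - 1
    return (r[n] / r[n + 1]) * (n / (n + 1)) ** 1.5


if __name__ == "__main__":
    alpha = estimate_otter_alpha()
    print("alpha estimate:", alpha, "sqrt(alpha):", alpha ** 0.5)
```

The modified constant $\widehat{\alpha}$ would require the analogous count for the restricted family of trees, which is not specified in this summary and is therefore not attempted here.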

Paper Structure (22 sections, 86 equations, 10 figures)

This paper contains 22 sections, 86 equations, 10 figures.

Figures (10)

  • Figure 1: A sketch of the phase diagram for the partial recovery task in the correlated sparse Erdős-Rényi ensemble with finite average degree $\lambda$ and correlation $s$. The information-theoretically impossible region corresponds to $\lambda s <1$; its complement is divided into an easy phase, bounded by $s_{\rm algo}(\lambda)$, where polynomial-time algorithms are known to exist, and a hard phase where they are yet to be discovered (or proven not to exist). The dotted line represents the conjectured algorithmic phase transition line $\widehat{s}_{\rm algo}(\lambda)$ for the procedures introduced in the present work.
  • Figure 2: An example of a rooted tree $N$ whose root has $l=4$ offspring, its decomposition as a list of subtrees, and its unlabeled version defined in terms of the number of copies of unlabeled subtrees rooted at the offspring of the root.
  • Figure 3: Comparison of the scores $L_{i,i'}^{(d)}$ and their approximations $\widehat{L}_{i,i'}^{(d)}$ on a single graph with $n=1024$, $\lambda=2.4$, $s=0.9$, $d=4$, $m=2$ (left panel) and $m=3$ (right panel). Each point in these scatter plots corresponds to a pair $(i,i')$ of vertices, one from each graph; the horizontal axis shows $L_{i,i'}^{(d)}$ and the vertical axis $\widehat{L}_{i,i'}^{(d)}$. Correctly matched pairs with $i'=\pi_\star(i)$ are highlighted in red. The $x$ axis is on a log scale, while the $y$ axis is logarithmic on both the positive and negative sides, joined by a linear region around the origin (in gray in the figure); this region has been shrunk so that every displayed dot lies in a logarithmic part of the axis.
  • Figure 4: Left panel: the average overlap between the true and estimated permutations as a function of the depth parameter $d$ for $n=2048$, $\lambda=2.4$, $s=0.86$, $m=\infty$ (blue curve), $m=2$ (orange curve) and $m=3$ (green curve). Each point corresponds to an average over 50 independent samples. Right panel: The average overlap as a function of $s$ for $n=512$, $\lambda=1.2$, $m=\infty$ (blue curve), $m=2$ (orange curve) and $m=3$ (green curve). Each point is an average over 50 samples.
  • Figure 5: The average overlap as a function of $s$, for $n=1024$ (left column) and $n=2048$ (right column), $m=2$ (top row) and $m=3$ (bottom row), and several values of $\lambda$ (see the keys). Each point is an average over 50 samples.
  • ...and 5 more figures
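
The simulation setting behind these captions (a pair of graphs with $n$ vertices, average degree $\lambda$, correlation $s$, and an overlap with the hidden permutation) can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the common construction in which each pair of vertices carries an edge in both graphs with probability $\lambda s/n$ and in exactly one given graph with probability $\lambda(1-s)/n$, and it assumes that the overlap is the fraction of vertices whose estimated image equals the true one. All function names are hypothetical.

```python
import random


def sample_correlated_pair(n, lam, s, seed=None):
    """Sample two correlated sparse random graphs and a hidden relabeling.

    Returns (edges_a, edges_b, pi): edge sets of sorted pairs (i, j) with i < j,
    where edges_b has been relabeled by the hidden permutation pi.
    """
    rng = random.Random(seed)
    p_both = lam * s / n           # edge present in both graphs
    p_one = lam * (1.0 - s) / n    # edge present in one given graph only
    if p_both + 2 * p_one > 1.0:
        raise ValueError("lam is too large for this n")
    edges_a, edges_b = set(), set()
    for i in range(n):
        for j in range(i + 1, n):
            u = rng.random()
            if u < p_both:
                edges_a.add((i, j))
                edges_b.add((i, j))
            elif u < p_both + p_one:
                edges_a.add((i, j))
            elif u < p_both + 2 * p_one:
                edges_b.add((i, j))
    # Hide the correspondence: vertex i of the second graph becomes pi[i].
    pi = list(range(n))
    rng.shuffle(pi)
    relabeled = {(min(pi[i], pi[j]), max(pi[i], pi[j])) for (i, j) in edges_b}
    return edges_a, relabeled, pi


def overlap(estimate, truth):
    """Fraction of vertices i with estimate[i] == truth[i] (assumed definition)."""
    return sum(e == t for e, t in zip(estimate, truth)) / len(truth)
```

Averaging `overlap` over independent calls to `sample_correlated_pair` with different seeds mirrors the "average over 50 samples" mentioned in the captions; the alignment algorithm that produces `estimate` is not reproduced here.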