Thresholds for Reconstruction of Random Hypergraphs From Graph Projections

Guy Bresler, Chenghao Guo, Yury Polyanskiy

TL;DR

This work studies the problem of reconstructing a random $d$-uniform hypergraph from its graph projection and establishes a density-threshold landscape for exact recovery as a function of $\delta$ in the parameterization $p=n^{-d+1+\delta}$. The authors present the MAP rule as the information-theoretically optimal recovery method and show it is efficiently computable in the sparse regime via a decomposition into constant-size 2-connected components, while also proving when reconstruction is information-theoretically impossible. For $d=3$ they give a precise threshold, and for $d\ge4$ they provide bounds on the threshold, with an efficient algorithm achieving recovery in the feasible regions. The results extend to mildly inhomogeneous random hypergraphs, including HSBM, and yield an optimal recovery algorithm from the similarity matrix, improving prior work. Together, these findings clarify when graph projections preserve the higher-order structure and have practical implications for community detection and hypergraph inference from pairwise data.
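
As a rough orientation for the parameterization, here is a short worked calculation. It assumes, as is usual for this kind of model but not spelled out in the summary above, that each of the $\binom{n}{d}$ possible hyperedges is present independently with probability $p=n^{-d+1+\delta}$, and it treats $d$ as a constant while $n\to\infty$:

$$\mathbb{E}[\#\text{hyperedges}] = \binom{n}{d}\,p \approx \frac{n^{d}}{d!}\,n^{-d+1+\delta} = \frac{n^{1+\delta}}{d!}, \qquad \mathbb{E}[\deg(v)] = \binom{n-1}{d-1}\,p \approx \frac{n^{\delta}}{(d-1)!}.$$

Under that assumption, $\delta=0$ corresponds to a constant expected number of hyperedges per vertex, and larger $\delta$ means a denser hypergraph whose projection has more overlapping cliques.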

Abstract

The graph projection of a hypergraph is a simple graph with the same vertex set and with an edge between each pair of vertices that appear in a hyperedge. We consider the problem of reconstructing a random $d$-uniform hypergraph from its projection. Feasibility of this task depends on $d$ and the density of hyperedges in the random hypergraph. For $d=3$ we precisely determine the threshold, while for $d\geq 4$ we give bounds. All of our feasibility results are obtained by exhibiting an efficient algorithm for reconstructing the original hypergraph, while infeasibility is information-theoretic. Our results also apply to mildly inhomogeneous random hypergraphs, including hypergraph stochastic block models (HSBM). A consequence of our results is an optimal HSBM recovery algorithm, improving on a result of Gaudio and Joshi in 2023.
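
The projection defined in the abstract can be illustrated with a minimal Python sketch. The function name `project` and the set-of-pairs representation are illustrative choices, not taken from the paper. The example hypergraph reuses the three hyperedges listed in the Figure 5 caption and shows why reconstruction is non-trivial: the projection contains the triangle on $\{1,2,4\}$ even though $\{1,2,4\}$ is not a hyperedge.

```python
from itertools import combinations

def project(hyperedges):
    """Return the edge set of the graph projection.

    Each hyperedge contributes an edge between every pair of its vertices;
    edges are stored as sorted pairs so duplicates collapse.
    """
    edges = set()
    for h in hyperedges:
        for u, v in combinations(sorted(h), 2):
            edges.add((u, v))
    return edges

# 3-uniform example: the hyperedges named in the Figure 5 caption.
H = [{1, 2, 3}, {1, 4, 7}, {2, 4, 8}]
edges = project(H)

# The pairs of {1, 2, 4} are all present in the projection,
# although {1, 2, 4} itself is not a hyperedge of H.
triangle = {(1, 2), (1, 4), (2, 4)}
has_spurious_triangle = triangle <= edges
print(len(edges), has_spurious_triangle)
```

A reconstruction procedure therefore cannot simply declare every $d$-clique of the projected graph to be a hyperedge; it has to decide which cliques come from genuine hyperedges.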

Paper Structure

This paper contains 55 sections, 38 theorems, 132 equations, 7 figures, 1 table, 4 algorithms.

Key Result

Lemma 1.1

For any $d\ge 4$ and any $0\le \delta_1<\delta_2\le 1$, if exact recovery is information-theoretically possible (or efficiently achievable) when $\delta=\delta_2$, then exact recovery is also information-theoretically possible (or efficiently achievable) when $\delta=\delta_1$.

Figures (7)

  • Figure 1: An example of 2-connectivity and 2-connected components when $d=3$.
  • Figure 2: A graph with non-unique minimum preimage in the case $d=3$. The green hyperedges are the two possible minimum preimages.
  • Figure 3: Relation between different thresholds. The maximum clique cover algorithm $\mathcal{A}_c$ succeeds with high probability up to $\delta=\frac{d-3}{d}$. The MAP algorithm is efficient up to $\frac{d-1}{d+1}$ and succeeds with high probability up to threshold $\delta^*_{d}$. If $\delta_d^a<\frac{d-1}{d+1}$, then $\delta^*_{d}$ is the same as the ambiguous threshold $\delta_d^a$.
  • Figure 4: An illustration of $N(K)$ when $d=3$. Here $K$ only has one hyperedge $\{1,2,3\}$, colored in green. $N(K)$ contains three possible 2-neighbors of $\mathrm{Cli}(\mathrm{Proj}(K))=K$, colored in blue.
  • Figure 5: An example of an element in $\mathrm{Grow}(K,h)$, consisting of 3 hyperedges, $\{1,2,3\},\{1,4,7\}\text{ and }\{2,4,8\}$. Here $K$ contains one hyperedge $\{1,2,3\}$ and $h=\{1,2,4\}$. For $h$ to be included in the 2-connected component, one way is to include $S_1^{K,h} = \{1,4\}$ and $S_2^{K,h} = \{2,4\}$ in two separate hyperedges.
  • ...and 2 more figures
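
The two closed-form values quoted in the Figure 3 caption, $\frac{d-3}{d}$ for the maximum clique cover algorithm and $\frac{d-1}{d+1}$ for efficiency of the MAP algorithm, can be tabulated for small $d$. The sketch below only evaluates those two expressions; the function names are illustrative, and the thresholds $\delta^*_d$ and $\delta^a_d$ are not computed because this summary gives no formula for them.

```python
from fractions import Fraction

def clique_cover_threshold(d):
    """(d-3)/d: value up to which the caption says the clique cover algorithm succeeds."""
    return Fraction(d - 3, d)

def map_efficiency_threshold(d):
    """(d-1)/(d+1): value up to which the caption says the MAP algorithm is efficient."""
    return Fraction(d - 1, d + 1)

# Tabulate both expressions for a few uniformities d.
table = {d: (clique_cover_threshold(d), map_efficiency_threshold(d))
         for d in range(3, 7)}
for d, (cc, me) in table.items():
    print(f"d={d}: (d-3)/d = {cc}, (d-1)/(d+1) = {me}")
```

For every $d\ge 3$ the first value is strictly smaller than the second, which matches the ordering of the thresholds described in the caption.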

Theorems & Definitions (55)

  • Remark 1
  • Lemma 1.1: Monotonicity in $\delta$
  • Corollary 1.2: Threshold for Exact Recovery
  • Theorem 1.1
  • Theorem 1.2
  • Theorem 1.3
  • Theorem 1.4
  • proof
  • Remark 2
  • Theorem 2.1
  • ...and 45 more