Information-Theoretic Thresholds for Bipartite Latent-Space Graphs Under Noisy Observations
Andreas Göbel, Marcus Pappik, Leon Schiller
TL;DR
The paper establishes tight information-theoretic thresholds for detecting latent geometry in bipartite Gaussian random geometric graphs under a binary masking process. It introduces a novel Fourier-analytic framework that bounds signed subgraph counts by exploiting cancellations in the characteristic-function expansions, enabling control over large subgraphs and leading to precise phase diagrams that depend on the latent dimension $d$ and mask density $q$. A conditional second-moment method is developed to derive hardness results and to bridge known-vs-unknown mask settings, showing that knowing the mask lowers the effective sparsity threshold (roughly replacing $q$ with $q^2$ in the analysis). The results imply that there is no computational-statistical gap in the considered regimes and yield efficient tests based on wedges and 4-cycles, with extensions suggested to non-bipartite and sparser settings. Overall, the work advances understanding of latent geometry detectability in noisy, high-dimensional graph models and provides tools potentially applicable to related detection problems.
Abstract
We study information-theoretic phase transitions for the detectability of latent geometry in bipartite random geometric graphs (RGGs) with Gaussian $d$-dimensional latent vectors, where only a subset of edges carries latent information, determined by a random mask with i.i.d. $\mathrm{Bern}(q)$ entries. For any fixed edge density $p \in (0,1)$, we determine essentially tight thresholds for this problem as a function of $d$ and $q$. Our results show that the detection problem is substantially easier if the mask is known upfront than if it is hidden. Our analysis builds on a novel Fourier-analytic framework for bounding signed subgraph counts in Gaussian random geometric graphs, which exploits cancellations that arise after approximating characteristic functions by an appropriate power series. The resulting bounds apply to much larger subgraphs than those considered in previous work, which enables tight information-theoretic bounds, whereas the bounds in previous work only yield lower bounds through the lens of low-degree polynomials. As a consequence, we identify the optimal information-theoretic thresholds and rule out computational-statistical gaps. Our bounds further improve upon the bounds on Fourier coefficients of random geometric graphs recently given by Bangachev and Bresler [STOC'24] in the dense, bipartite case. The techniques also extend to sparser and non-bipartite settings, at least if the considered subgraphs are sufficiently small. We further believe that they might help resolve open questions for related detection problems.
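
To make the setting concrete, the following is a minimal Python sketch of one plausible reading of the model, together with the signed wedge and signed 4-cycle statistics mentioned in the TL;DR. Several details are assumptions rather than statements from the paper: that a masked-in edge is present when the inner product of its two latent vectors exceeds a threshold `tau`, that `tau` is calibrated by Monte Carlo to give edge density `p`, that masked-out edges are independent $\mathrm{Bern}(p)$ coins, and that "signed" means centering each adjacency entry by `p`. All function names are hypothetical.

```python
import numpy as np


def sample_masked_bipartite_rgg(n, m, d, p, q, rng, calib_samples=20000):
    """Sample an n x m bipartite adjacency matrix and its mask (assumed model)."""
    x = rng.standard_normal((n, d))  # latent vectors of the left vertices
    y = rng.standard_normal((m, d))  # latent vectors of the right vertices

    # Assumption: calibrate tau empirically so that P(<x, y> >= tau) is about p.
    u = rng.standard_normal((calib_samples, d))
    v = rng.standard_normal((calib_samples, d))
    tau = np.quantile(np.sum(u * v, axis=1), 1.0 - p)

    geometric = (x @ y.T) >= tau         # edges that carry latent information
    noise = rng.random((n, m)) < p       # independent Bern(p) edges
    mask = rng.random((n, m)) < q        # i.i.d. Bern(q) mask
    adj = np.where(mask, geometric, noise).astype(int)
    return adj, mask


def signed_wedge_count(adj, p):
    """Sum over left pairs i < i' and right centers j of (A_ij - p)(A_i'j - p)."""
    b = adj - p
    g = b @ b.T
    return (g.sum() - np.trace(g)) / 2.0


def signed_four_cycle_count(adj, p):
    """Sum over i < i', j < j' of the product of the four centered entries."""
    b = adj - p
    g = b @ b.T                  # g[i, i'] = sum_j b[i, j] * b[i', j]
    h = (b * b) @ (b * b).T      # removes the j = j' terms from g ** 2
    t = g ** 2 - h
    # Off-diagonal entries count each cycle 4 times (ordered i, i' and j, j').
    return (t.sum() - np.trace(t)) / 4.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    adj, mask = sample_masked_bipartite_rgg(n=200, m=200, d=10, p=0.5, q=0.5, rng=rng)
    print("signed wedges:  ", signed_wedge_count(adj, 0.5))
    print("signed 4-cycles:", signed_four_cycle_count(adj, 0.5))
```

Comparing these statistics against their values on a purely random bipartite graph with the same density (for instance, the same sampler with `q=0`) gives a rough feel for how the signal weakens as `d` grows or `q` shrinks. The sketch does not reproduce the paper's thresholds or its known-mask test.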
