The Fundamental Limits of Recovering Planted Subgraphs
Daniel Lee, Francisco Pernice, Amit Rajaraman, Ilias Zadik
TL;DR
This work develops a theory of the limiting MMSE for recovering planted subgraphs from noisy graphs. It introduces φ_q thresholds and the onion decomposition to describe the MMSE curve for any weakly dense planted subgraph, and proves that these thresholds determine the sharp transitions between the recoverable and unrecoverable fractions of the planted structure. The authors then generalize to minimax recovery rates for arbitrary monotone properties, using ψ_q thresholds and a fractional-expectation duality. This makes the optimizers efficiently computable in the dense regime and gives subgraph expectation thresholds a statistical interpretation. The results connect probabilistic combinatorics with Bayesian inference, offering a framework for phase transitions and minimax rates in high-dimensional planted-structure problems, with potential algorithmic implications through the onion-based computability and the duality techniques.
Abstract
Given an arbitrary subgraph $H=H_n$ and $p=p_n \in (0,1)$, the planted subgraph model is defined as follows. A statistician observes the union of a random copy $H^*$ of $H$ and random noise in the form of an instance of an Erdős-Rényi graph $G(n,p)$. Their goal is to recover the planted $H^*$ from the observed graph. Our focus in this work is to understand the minimum mean squared error (MMSE) for sufficiently large $n$. A recent paper [MNSSZ23] characterizes the graphs for which the limiting MMSE curve undergoes a sharp phase transition from $0$ to $1$ as $p$ increases, a behavior known as the all-or-nothing phenomenon, up to a mild density assumption on $H$. In this paper, we provide a formula for the limiting MMSE curve for any graph $H=H_n$, up to the same mild density assumption. This curve is expressed in terms of a variational formula over pairs of subgraphs of $H$, and is inspired by the celebrated subgraph expectation thresholds from the probabilistic combinatorics literature [KK07]. Furthermore, we give a polynomial-time description of the optimizers of this variational problem. This allows one to efficiently approximate the MMSE curve for any dense graph $H$ when $n$ is large enough. The proof relies on a novel graph decomposition of $H$ as well as a new minimax theorem, which may be of independent interest. Our results generalize to the setting of minimax rates of recovering arbitrary monotone Boolean properties planted in random noise, where the statistician observes the union of a planted minimal element $A \subseteq [N]$ of a monotone property and a random $Ber(p)^{\otimes N}$ vector. In this setting, we provide a variational formula inspired by the so-called "fractional" expectation threshold [Tal10], again describing the MMSE curve (in this case up to a multiplicative constant) for large enough $n$.
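To make the observation model concrete, the following is a minimal simulation sketch of the planted subgraph model described in the abstract. It is illustrative only and is not taken from the paper: the function name `sample_planted_graph`, the edge-list representation of $H$, and the choice to draw the random copy by relabeling the vertices of $H$ with a uniformly random injective map into $\{0,\dots,n-1\}$ are all assumptions made here.

```python
import random
from itertools import combinations


def sample_planted_graph(n, h_edges, p, rng=None):
    """Sample one observation from the planted subgraph model (illustrative sketch).

    n       -- number of vertices of the ambient graph, labeled 0..n-1
    h_edges -- edges of H as pairs of hashable vertex names (assumed representation)
    p       -- edge probability of the Erdos-Renyi noise G(n, p)
    Returns (planted, observed), each a set of frozenset edges.
    """
    if rng is None:
        rng = random.Random()

    # Vertices of H that appear in at least one edge.
    h_vertices = sorted({v for edge in h_edges for v in edge})

    # Random copy H^*: relabel H by a uniformly random injective map into range(n).
    images = rng.sample(range(n), len(h_vertices))
    relabel = dict(zip(h_vertices, images))
    planted = {frozenset((relabel[u], relabel[v])) for u, v in h_edges}

    # Noise: each of the C(n, 2) possible edges appears independently with probability p.
    noise = {frozenset(e) for e in combinations(range(n), 2) if rng.random() < p}

    # The statistician only sees the union of the planted copy and the noise.
    return planted, planted | noise


# Usage: plant a triangle in noise on 10 vertices.
triangle = [(0, 1), (1, 2), (0, 2)]
planted, observed = sample_planted_graph(10, triangle, 0.3, random.Random(0))
```

The recovery task is then to estimate `planted` given only `observed`. The sketch covers sampling only; it does not implement any estimator or the MMSE formula from the paper.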
