The Fundamental Limits of Recovering Planted Subgraphs

Daniel Lee, Francisco Pernice, Amit Rajaraman, Ilias Zadik

TL;DR

This work provides a comprehensive theory for the limiting MMSE in recovering planted subgraphs from noisy graphs. It introduces φ_q thresholds and the onion decomposition to precisely describe the MMSE curve for any weakly dense planted subgraph and proves these thresholds determine the sharp transitions between recoverable and unrecoverable fractions of the planted structure. The authors further generalize to minimax recovery rates for arbitrary monotone properties, via ψ_q thresholds and a fractional-expectation duality, yielding computable structure in the dense regime and a principled statistical meaning for subgraph thresholds. The results bridge probabilistic combinatorics and Bayesian inference, offering a versatile framework for understanding phase transitions and minimax rates in high-dimensional planted-structure problems, with potential algorithmic implications through the onion-based computability and duality techniques.

Abstract

Given an arbitrary subgraph $H=H_n$ and $p=p_n \in (0,1)$, the planted subgraph model is defined as follows. A statistician observes the union of a random copy $H^*$ of $H$, together with random noise in the form of an instance of an Erdős-Rényi graph $G(n,p)$. Their goal is to recover the planted $H^*$ from the observed graph. Our focus in this work is to understand the minimum mean squared error (MMSE) for sufficiently large $n$. A recent paper [MNSSZ23] characterizes the graphs for which the limiting MMSE curve undergoes a sharp phase transition from $0$ to $1$ as $p$ increases, a behavior known as the all-or-nothing phenomenon, up to a mild density assumption on $H$. In this paper, we provide a formula for the limiting MMSE curve for any graph $H=H_n$, up to the same mild density assumption. This curve is expressed in terms of a variational formula over pairs of subgraphs of $H$, and is inspired by the celebrated subgraph expectation thresholds from the probabilistic combinatorics literature [KK07]. Furthermore, we give a polynomial-time description of the optimizers of this variational problem. This allows one to efficiently approximately compute the MMSE curve for any dense graph $H$ when $n$ is large enough. The proof relies on a novel graph decomposition of $H$ as well as a new minimax theorem which may be of independent interest. Our results generalize to the setting of minimax rates of recovering arbitrary monotone boolean properties planted in random noise, where the statistician observes the union of a planted minimal element $A \subseteq [N]$ of a monotone property and a random $Ber(p)^{\otimes N}$ vector. In this setting, we provide a variational formula inspired by the so-called "fractional" expectation threshold [Tal10], again describing the MMSE curve (in this case up to a multiplicative constant) for large enough $n$.
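
To make the observation model in the abstract concrete, the following is a minimal sketch of drawing one sample from it, not the authors' code. It assumes $H$ is given as a list of vertex pairs, and that a "random copy" is obtained by relabeling the vertices of $H$ through a uniformly random injection into $\{0,\dots,n-1\}$. The function name `sample_planted_subgraph` and its parameters are hypothetical.

```python
import itertools
import random


def sample_planted_subgraph(n, h_edges, p, rng=None):
    """Draw one sample of the planted subgraph model (illustrative sketch).

    n       -- number of vertices of the observed graph
    h_edges -- edges of H as pairs of hashable, sortable vertex labels (assumed input format)
    p       -- edge probability of the G(n, p) noise
    Returns (observed_edges, planted_edges) as sets of frozensets.
    """
    rng = rng or random.Random()
    h_vertices = sorted({v for edge in h_edges for v in edge})

    # Random copy H*: map H's vertices to distinct vertices of [n] uniformly at random.
    images = rng.sample(range(n), len(h_vertices))
    relabel = dict(zip(h_vertices, images))
    planted = {frozenset((relabel[u], relabel[v])) for u, v in h_edges}

    # Noise: every possible edge is included independently with probability p.
    noise = {
        frozenset(pair)
        for pair in itertools.combinations(range(n), 2)
        if rng.random() < p
    }

    # The statistician only sees the union; `planted` is the hidden target.
    return planted | noise, planted


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    observed, planted = sample_planted_subgraph(20, triangle, 0.1, random.Random(0))
    print(len(observed), len(planted))
```

The recovery task is then to estimate `planted` from `observed` alone; the MMSE studied in the paper measures the best achievable error of such an estimate.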

Paper Structure

This paper contains 32 sections, 27 theorems, 160 equations, 1 figure.

Key Result

Theorem 1.1

Let $\varepsilon > 0$, and $H = H_n$ an arbitrary weakly dense graph. Then, for sufficiently large $n$, there exists an integer $1\leqslant M\leqslant |H|$ and thresholds $1=q_M>\cdots>q_1>q_0=0$, such that the following holds for the planted subgraph model corresponding to $H$.

Figures (1)

  • Figure 1: A pictorial representation of the onion decomposition of a graph (left) and the corresponding MMSE curve (right). Each $q_i$ represents the cumulative fraction of the graph included in all layers from the center through the $i$th layer; hence $q_4=1$. The critical thresholds described above are denoted by $\varphi$, following the notation in our theorem statement below.
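
As a small illustration of how the $q_i$ in the caption relate to the layers, the sketch below computes cumulative fractions from a list of layer sizes ordered from the center outward. It assumes layer size is measured in edges, and the input values are hypothetical; it does not compute the onion decomposition itself.

```python
from itertools import accumulate


def cumulative_fractions(layer_sizes):
    """Return [q_0, q_1, ..., q_M] for layers listed from the center outward.

    layer_sizes -- positive sizes of the M layers (assumed here to be edge counts)
    q_i is the fraction of the graph contained in layers 1..i, so q_0 = 0 and q_M = 1.
    """
    total = sum(layer_sizes)
    return [0.0] + [partial / total for partial in accumulate(layer_sizes)]


if __name__ == "__main__":
    # Four hypothetical layers of equal size, matching the M = 4 layers in Figure 1.
    print(cumulative_fractions([1, 1, 1, 1]))
```

With four layers the last entry is $q_4 = 1$, consistent with the caption and with the ordering $1=q_M>\cdots>q_1>q_0=0$ in Theorem 1.1.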

Theorems & Definitions (65)

  • Theorem 1.1: Informal version of \ref{thm:wek_dense} and \ref{lem:onion-universality}(c)
  • Definition 1.2
  • Corollary 1.3: Informal, based on \ref{lem:onion-universality} and \ref{cor:wek-dense-improved}
  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Definition 2.4: Fractional expectation threshold
  • Theorem 3.1
  • Remark 3.2: The MMSE curve using the discontinuities of $\varphi_q$
  • Definition 3.3: Onion decomposition of $H$
  • ...and 55 more