Random Subgraph Detection Using Queries
Wasim Huleihel, Arya Mazumdar, Soumyabrata Pal
TL;DR
This paper studies planted densest subgraph detection under an adaptive edge-query model, formalizing the QPDS problem and characterizing both its information-theoretic limits and efficient algorithms. The authors derive tight bounds parameterized by $\chi^2(p||q)$ and $d_{KL}(p||q)$, and present two non-adaptive tests (a scan-based method and a degree-based method) that achieve the statistical thresholds, along with a polynomial-time detector that incurs a higher query cost. They establish a non-adaptive lower bound, under which detection is impossible when ${\sf Q} < (2-\varepsilon)\frac{n^2}{k^2\chi^4(p||q)}\log^2\frac{n}{k}$, and show statistical sufficiency when $k\ge (2+\varepsilon_0)\frac{\log n}{d_{KL}(p||q)}$. They also show computational sufficiency with ${\sf Q}=O(\frac{n^3}{k^3\chi^2(p||q)}\log^3 n)$ queries in the regime $k=\Omega(\sqrt{n\log n/\chi^2(p||q)})$, with extensions to general submatrix detection in terms of $\chi^2({\mathcal P}||{\mathcal Q})$ and $d_{KL}({\mathcal P}||{\mathcal Q})$. The work resolves two open questions from the literature and maps the statistical-computational landscape for query-limited detection, informing active data acquisition and privacy-aware network analysis.
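To make the quantities in these bounds concrete, the following is a minimal sketch that evaluates them numerically. It assumes that $\chi^2(p||q)$ and $d_{KL}(p||q)$ denote the standard chi-square and Kullback-Leibler divergences between Bernoulli($p$) and Bernoulli($q$), that logarithms are natural, and that $\chi^4$ means $(\chi^2)^2$. The function names are hypothetical and the constants $\varepsilon$, $\varepsilon_0$ are set to zero by default purely for illustration.

```python
import math

def chi2_bernoulli(p, q):
    """Chi-square divergence between Bernoulli(p) and Bernoulli(q) (assumed definition)."""
    return (p - q) ** 2 / (q * (1 - q))

def kl_bernoulli(p, q):
    """KL divergence in nats between Bernoulli(p) and Bernoulli(q) (assumed definition)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def query_lower_bound(n, k, p, q, eps=0.0):
    """Evaluate (2 - eps) * n^2 / (k^2 * chi^4) * log^2(n / k), reading chi^4 as (chi^2)^2."""
    chi2 = chi2_bernoulli(p, q)
    return (2 - eps) * n ** 2 / (k ** 2 * chi2 ** 2) * math.log(n / k) ** 2

def min_planted_size(n, p, q, eps0=0.0):
    """Evaluate (2 + eps0) * log(n) / d_KL(p||q), the size condition on k."""
    return (2 + eps0) * math.log(n) / kl_bernoulli(p, q)

if __name__ == "__main__":
    n, k, p, q = 10_000, 200, 0.75, 0.5  # illustrative parameters only
    print("chi^2(p||q)     =", chi2_bernoulli(p, q))
    print("d_KL(p||q)      =", kl_bernoulli(p, q))
    print("query threshold =", query_lower_bound(n, k, p, q))
    print("size threshold  =", min_planted_size(n, p, q))
```

Both expressions grow as $p$ approaches $q$, since the two divergences shrink toward zero: a weaker planted signal needs more queries and a larger planted set.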
Abstract
The planted densest subgraph detection problem refers to the task of testing whether a given (random) graph contains a subgraph that is unusually dense. Specifically, we observe an undirected and unweighted graph on $n$ vertices. Under the null hypothesis, the graph is a realization of an Erdős-Rényi graph with edge probability (or, density) $q$. Under the alternative, there is a subgraph on $k$ vertices with edge probability $p>q$. The statistical as well as the computational barriers of this problem are well-understood for a wide range of the edge parameters $p$ and $q$. In this paper, we consider a natural variant of the above problem, where one can only observe a relatively small part of the graph using adaptive edge queries. For this model, we determine the number of queries necessary and sufficient (accompanied by a quasi-polynomial optimal algorithm) for detecting the presence of the planted subgraph. We also propose a polynomial-time algorithm which is able to detect the planted subgraph, albeit with more queries compared to the above lower bound. We conjecture that in the leftover regime, no polynomial-time algorithms exist. Our results resolve two open questions posed in the past literature.
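The query model described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's scan or degree test: it only shows a lazily sampled edge oracle for the two hypotheses and a naive non-adaptive strategy that fixes all vertex pairs in advance and reports the observed edge density. The names `EdgeOracle`, `nonadaptive_pairs`, and `edge_density` are hypothetical, and all parameters are illustrative.

```python
import random
from itertools import combinations

class EdgeOracle:
    """Answers edge queries for a graph on n vertices, sampling each edge on first use."""

    def __init__(self, n, q, planted=(), p=None, seed=0):
        self.n, self.q, self.p = n, q, p
        self.planted = set(planted)      # empty set corresponds to the null hypothesis
        self.rng = random.Random(seed)
        self.answers = {}                # cache so repeated queries stay consistent

    def query(self, u, v):
        key = (min(u, v), max(u, v))     # undirected graph: order does not matter
        if key not in self.answers:
            inside = u in self.planted and v in self.planted
            prob = self.p if inside else self.q
            self.answers[key] = self.rng.random() < prob
        return self.answers[key]

    @property
    def num_queries(self):
        """Number of distinct vertex pairs queried so far."""
        return len(self.answers)

def nonadaptive_pairs(n, m, seed=0):
    """Fix the query set up front: all pairs inside a random subset of m vertices."""
    subset = random.Random(seed).sample(range(n), m)
    return list(combinations(subset, 2))

def edge_density(oracle, pairs):
    """Fraction of queried pairs that turned out to be edges."""
    return sum(oracle.query(u, v) for u, v in pairs) / len(pairs)

if __name__ == "__main__":
    n, k, p, q = 400, 100, 0.8, 0.3
    pairs = nonadaptive_pairs(n, m=60, seed=1)
    null = EdgeOracle(n, q, seed=2)
    alt = EdgeOracle(n, q, planted=range(k), p=p, seed=2)
    print("queries used:", len(pairs))
    print("density under null:       ", edge_density(null, pairs))
    print("density under alternative:", edge_density(alt, pairs))
```

Because the pairs are chosen before any answer is seen, this strategy is non-adaptive; an adaptive strategy would instead choose each new pair based on earlier answers.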
