Random Subgraph Detection Using Queries

Wasim Huleihel, Arya Mazumdar, Soumyabrata Pal

TL;DR

This paper studies planted densest subgraph detection under a non-adaptive edge-query model, formalizing the QPDS problem and characterizing its information-theoretic limits and efficient algorithms. The authors derive tight, parameterized bounds depending on $\chi^2(p||q)$ and $d_{KL}(p||q)$, and present two non-adaptive tests (a scan-based method and a degree-based method) that achieve the statistical thresholds, along with a polynomial-time detector that incurs a higher query cost. They establish a non-adaptive lower bound, under which detection is impossible when ${\sf Q} < (2-\varepsilon)\frac{n^2}{k^2\chi^4(p||q)}\log^2\frac{n}{k}$, and show statistical sufficiency when $k\ge (2+\varepsilon_0)\frac{\log n}{d_{KL}(p||q)}$, as well as computational sufficiency with ${\sf Q}=O(\frac{n^3}{k^3\chi^2(p||q)}\log^3 n)$ queries in the regime $k=\Omega(\sqrt{n\log n/\chi^2(p||q)})$, with extensions to general submatrix detection using $\chi^2({\mathcal P}||{\mathcal Q})$ and $d_{KL}({\mathcal P}||{\mathcal Q})$. The work resolves open questions and maps the statistical-computational landscape for query-limited detection, informing active data acquisition and privacy-aware network analysis.
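To make the quantities in the bounds above concrete, the following is a minimal sketch that evaluates them numerically. It assumes that $\chi^2(p||q)$ and $d_{KL}(p||q)$ denote the standard chi-square and Kullback-Leibler divergences between Bernoulli($p$) and Bernoulli($q$), that $\chi^4$ means $(\chi^2)^2$, and that logarithms are natural. The function names (`chi2_bernoulli`, `kl_bernoulli`, `query_lower_bound`, `min_planted_size`) are hypothetical and chosen only for illustration.

```python
import math


def chi2_bernoulli(p: float, q: float) -> float:
    """Chi-square divergence between Bernoulli(p) and Bernoulli(q), 0 < q < 1."""
    return (p - q) ** 2 / (q * (1.0 - q))


def kl_bernoulli(p: float, q: float) -> float:
    """KL divergence in nats between Bernoulli(p) and Bernoulli(q), 0 < p, q < 1."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def query_lower_bound(n: int, k: int, p: float, q: float, eps: float = 0.0) -> float:
    """Evaluate (2 - eps) * n^2 / (k^2 * chi^4) * log^2(n / k).

    Assumption: chi^4 is read as the square of the chi-square divergence.
    """
    chi2 = chi2_bernoulli(p, q)
    return (2.0 - eps) * n ** 2 / (k ** 2 * chi2 ** 2) * math.log(n / k) ** 2


def min_planted_size(n: int, p: float, q: float, eps0: float = 0.0) -> float:
    """Evaluate (2 + eps0) * log(n) / d_KL(p || q)."""
    return (2.0 + eps0) * math.log(n) / kl_bernoulli(p, q)


if __name__ == "__main__":
    n, k, p, q = 10_000, 200, 0.5, 0.25
    print("chi2(p||q) =", chi2_bernoulli(p, q))
    print("d_KL(p||q) =", kl_bernoulli(p, q))
    print("query threshold =", query_lower_bound(n, k, p, q))
    print("size threshold =", min_planted_size(n, p, q))
```

The sketch only evaluates the expressions as written in the summary; it does not supply the hidden constant in the $O(\cdot)$ bound for the polynomial-time detector, which is why that bound is left out.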

Abstract

The planted densest subgraph detection problem refers to the task of testing whether in a given (random) graph there is a subgraph that is unusually dense. Specifically, we observe an undirected and unweighted graph on $n$ vertices. Under the null hypothesis, the graph is a realization of an Erdős-Rényi graph with edge probability (or, density) $q$. Under the alternative, there is a subgraph on $k$ vertices with edge probability $p>q$. The statistical as well as the computational barriers of this problem are well-understood for a wide range of the edge parameters $p$ and $q$. In this paper, we consider a natural variant of the above problem, where one can only observe a relatively small part of the graph using adaptive edge queries. For this model, we determine the number of queries necessary and sufficient (accompanied by a quasi-polynomial optimal algorithm) for detecting the presence of the planted subgraph. We also propose a polynomial-time algorithm which is able to detect the planted subgraph, albeit with more queries than the above lower bound. We conjecture that in the leftover regime, no polynomial-time algorithms exist. Our results resolve two open questions posed in the past literature.
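The hypothesis-testing setup described above can be simulated in a few lines. The sketch below is an illustration under stated assumptions, not the paper's scan or degree test: it builds a lazy edge oracle for the null and planted models, queries every pair inside a random vertex subset fixed in advance, and rejects the null when the observed edge count is far above its null mean. The names `make_oracle` and `edge_count_test`, the subset size `m`, and the threshold multiplier `z` are all hypothetical choices made for this example.

```python
import itertools
import math
import random


def make_oracle(n, k, p, q, planted, rng):
    """Return (query, planted_set); query(i, j) lazily samples edge {i, j}.

    Under the null (planted=False) every edge has probability q. Under the
    alternative, edges inside a uniformly random k-subset have probability p.
    """
    planted_set = set(rng.sample(range(n), k)) if planted else set()
    cache = {}

    def query(i, j):
        key = (min(i, j), max(i, j))
        if key not in cache:
            inside = i in planted_set and j in planted_set
            cache[key] = rng.random() < (p if inside else q)
        return cache[key]

    return query, planted_set


def edge_count_test(query, n, m, q, rng, z=3.0):
    """Query all pairs in a random m-vertex subset and threshold the edge count.

    Returns (reject_null, number_of_queries). The null is rejected when the
    count exceeds its null mean by more than z null standard deviations.
    """
    subset = rng.sample(range(n), m)
    pairs = list(itertools.combinations(subset, 2))
    edges = sum(query(i, j) for i, j in pairs)
    mean = q * len(pairs)
    std = math.sqrt(q * (1.0 - q) * len(pairs))
    return edges > mean + z * std, len(pairs)


if __name__ == "__main__":
    rng = random.Random(0)
    for planted in (False, True):
        query, _ = make_oracle(n=200, k=150, p=0.9, q=0.1, planted=planted, rng=rng)
        reject, used = edge_count_test(query, n=200, m=40, q=0.1, rng=rng, z=6.0)
        print(f"planted={planted}: reject_null={reject}, queries={used}")
```

A plain edge count is only informative when the planted subgraph is large and dense enough to shift the total noticeably; it is used here because it is the simplest test that fits the query model, not because it achieves the thresholds discussed in the paper.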

Paper Structure

This paper contains 11 sections, 2 theorems, 57 equations, 1 figure, 2 algorithms.

Key Result

Theorem 1

Consider the $\mathsf{QPDS}(n,k,p,q)$ detection problem, and let $\epsilon>0$ be arbitrary. The following statements hold.

Figures (1)

  • Figure 1: Phase diagram for detecting the presence of a planted dense subgraph, as a function of the dense subgraph size $k = \Theta(n^{\beta})$ and the number of non-adaptive edge queries $\mathsf{Q}=\Theta(n^{\alpha})$.

Theorems & Definitions (9)

  • Definition 1: $\mathsf{PDS}$ detection problem
  • Definition 2: Oracle/Edge queries
  • Theorem 1: Detecting a planted densest subgraph
  • Conjecture 1: Planted clique conjecture
  • Definition 3: General submatrix detection
  • Definition 4: Oracle/Entries queries
  • Lemma 1: Total number of planted edge queries
  • Remark 1
  • Proof: Proof of Lemma 1