On The MCMC Performance In Bernoulli Group Testing And The Random Max Set-Cover Problem

Maxwell Lovig, Ilias Zadik

TL;DR

It is proved that, despite the intriguing success in simulations for small $n$, the class of MCMC methods proposed in previous work for BGT with $m^*$ samples takes super-polynomial-in-$n$ time to identify the infected individuals, when $k=n^{\alpha}$ for $\alpha \in (0,1)$ small enough.

Abstract

The group testing problem is a canonical inference task where one seeks to identify $k$ infected individuals out of a population of $n$ people, based on the outcomes of $m$ group tests. Of particular interest is the case of Bernoulli group testing (BGT), where each individual participates in each test independently and with a fixed probability. BGT is known to be an "information-theoretically" optimal design, as there exists a decoder that can identify the infected individuals, with high probability as $n$ grows, using $m^*=\log_2 \binom{n}{k}$ BGT tests, which is the minimum required number of tests among all group testing designs. An important open question in the field is whether a polynomial-time decoder exists for BGT that also succeeds with $m^*$ samples. In a recent paper (Iliopoulos, Zadik COLT '21) some evidence was presented (but no proof) that a simple low-temperature MCMC method could succeed. The evidence was based on a first-moment (or "annealed") analysis of the landscape, as well as simulations that show the MCMC method succeeding for $n$ in the thousands. In this work, we prove that, despite the intriguing success in simulations for small $n$, the class of MCMC methods proposed in previous work for BGT with $m^*$ samples takes super-polynomial-in-$n$ time to identify the infected individuals, when $k=n^{\alpha}$ for $\alpha \in (0,1)$ small enough. Towards obtaining our results, we establish the tight max-satisfiability thresholds of the random $k$-set cover problem, a result of potentially independent interest in the study of random constraint satisfaction problems.

Paper Structure

This paper contains 45 sections, 37 theorems, 261 equations, 8 figures.

Key Result

Theorem 1.1

For Bernoulli group testing, suppose $k=\left\lfloor n^{\alpha} \right\rfloor$ for some constant $\alpha \in (0,1)$ which is less than a sufficiently small constant. If the test size satisfies $N \leq 1.4749 \log_2 \binom{n}{k}$, then b-OGP exists a.a.s. as $n \rightarrow +\infty$.
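
To get a feel for the scale of the bound in Theorem 1.1, the right-hand side $1.4749 \log_2 \binom{n}{k}$ can be evaluated numerically for $k=\lfloor n^{\alpha} \rfloor$. The snippet below is a minimal sketch: the helper names and the sample values of $n$ and $\alpha$ are assumptions for illustration, and since the theorem is asymptotic and requires $\alpha$ small enough, the numbers say nothing about whether the statement holds at any particular finite $n$.

```python
import math

OGP_CONSTANT = 1.4749  # constant appearing in Theorem 1.1


def sparsity(n: int, alpha: float) -> int:
    """k = floor(n^alpha), as in the theorem statement."""
    return math.floor(n ** alpha)


def ogp_test_bound(n: int, alpha: float) -> float:
    """Right-hand side of the condition N <= 1.4749 * log2 C(n, k)."""
    k = sparsity(n, alpha)
    return OGP_CONSTANT * math.log2(math.comb(n, k))


# Hypothetical sample values, chosen only for illustration.
for n in (10**4, 10**6, 10**8):
    alpha = 0.1
    k = sparsity(n, alpha)
    m_star = math.log2(math.comb(n, k))
    print(f"n={n:>9}  k={k:>2}  m*={m_star:8.2f}  bound={ogp_test_bound(n, alpha):8.2f}")
```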

Figures (8)

  • Figure 1: A realization for an instance of Bernoulli group testing.
  • Figure 2: $\Phi_k$, the maximal proportion of covered sets for some size $k$ set of elements, as a function of $C$ for the random MAX $k$-set cover problem, where $C$ controls the number of "target" sets (or constraints).
  • Figure 3: A realization for an instance of Bernoulli group testing, now with the COMP post-processing applied.
  • Figure 4: Solutions to two differing first moment functions: the unconditional first moment function of Iliopoulos and Zadik (2021) in blue and our conditional first moment function in black. These plots were made with parameters $\alpha = 0.01$, $a = 1.17$ and varying $C$ and $n$. The unconditional first moment function is monotonic for all of our chosen values of $C$ and $n$, confirming the analysis done by Iliopoulos and Zadik (2021). Our conditional first moment function is non-monotonic for $C$ sufficiently small and $n$ sufficiently large, confirming our Theorem \ref{lem:FMFnonMono_0}.
  • Figure 5: The green and orange regions in the above plot represent the values of $\alpha$ and $C$ for which conditions \ref{eq:FMFExists1} and \ref{eq:FMFExists2} from Assumption \ref{as:FMFExists} are satisfied under the choice of $a$ from the lower boundary of the set \ref{eq:a:set} and setting $C_{34} = C_{35} = 0$. Note that the region in green is a subset of the region in orange.
  • ...and 3 more figures
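
The caption of Figure 3 refers to COMP post-processing. As it is usually described in the group testing literature, COMP clears every individual who appears in at least one negative test and keeps the rest as possibly infected. The sketch below follows that standard description and is not taken from the paper; the function name and the tiny hand-made instance are assumptions for illustration.

```python
def comp_candidates(design, outcomes):
    """Return the individuals not cleared by any negative test.

    design[t][i] is True when individual i participates in test t;
    outcomes[t] is True when test t is positive.
    """
    n = len(design[0])
    cleared = set()
    for row, positive in zip(design, outcomes):
        if not positive:
            # Everyone in a negative test must be non-infected.
            cleared.update(i for i in range(n) if row[i])
    return set(range(n)) - cleared


# Hypothetical instance: 4 individuals, 3 tests, individual 0 infected.
design = [
    [True, True, False, False],   # contains 0 -> positive
    [False, True, True, False],   # no infected -> negative, clears 1 and 2
    [True, False, False, True],   # contains 0 -> positive
]
outcomes = [True, False, True]

print(sorted(comp_candidates(design, outcomes)))  # [0, 3]
```

Individual 3 survives because it never appears in a negative test, which shows that COMP only shrinks the candidate set and can leave non-infected individuals in it.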

Theorems & Definitions (93)

  • Theorem 1.1: Informal theorem, see Theorem \ref{thm:QualInc_0}
  • Corollary 1.2
  • Theorem 1.3
  • Definition 2.1
  • Remark 2.2
  • Definition 2.3
  • Definition 3.1
  • Definition 3.2
  • Lemma 3.3: Coja-Oghlan et al. (2022), Section 9.2.1 (arXiv version)
  • Definition 3.4
  • ...and 83 more