Nonadaptive Noise-Resilient Group Testing with Order-Optimal Tests and Fast-and-Reliable Decoding

Venkatesan Guruswami; Hsin-Po Wang

TL;DR

The paper tackles nonadaptive group testing (GT) under noise by introducing Gacha GT, a modular framework that unites list-decodable and list-recoverable coding ideas with expander-based test designs. The core innovation is a folded Reed--Solomon construction that yields probabilistic list decoding with a hash-table-style test encoding, enabling $m=\mathcal{O}_Z(\sigma k \log(n) 2^{\mathcal{O}(\tau)})$ tests and decoding time $\mathcal{O}_Z(\sigma k\,\mathrm{poly}(\log(\sigma n)) 2^{\mathcal{O}(\tau)})$, while ensuring an average of $k\exp(-\sigma \log_2(n)^{1-1/\tau})$ misclassifications. By composing a hierarchy of gadgets (parallel, serial, and pyramid formations) and applying denoising steps (including Barg--Zémor capacity-achieving codes and channel downgrading), Gacha performs robustly across a wide range of parameter regimes and binary-input channels, improving on partial-recovery, exact-recovery, and worst-case GT schemes. The framework provides a modular path to scalable, noise-resilient GT with near-optimal test complexity and fast decoding, with potential impact on applications from heavy hitters to IoT device identification. Overall, Gacha demonstrates that carefully integrated coding-theoretic gadgets can drive order-optimal nonadaptive GT under realistic noisy conditions.
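To get a feel for how the free parameters $\sigma$ and $\tau$ trade off against the error guarantee, the misclassification expression quoted above can be evaluated directly. The following is only a minimal sketch: the function name `misclassification_bound` is hypothetical, and the test-count and decoding-time expressions are left out because their constants are hidden inside $\mathcal{O}_Z$.

```python
import math


def misclassification_bound(n: int, k: int, sigma: int, tau: int) -> float:
    """Evaluate k * exp(-sigma * log2(n)^(1 - 1/tau)).

    This is the average-misclassification expression quoted in the TL;DR.
    The function name and argument checks are illustrative assumptions.
    """
    if sigma < 1 or tau < 2:
        raise ValueError("expects sigma >= 1 and tau >= 2")
    return k * math.exp(-sigma * math.log2(n) ** (1 - 1 / tau))


if __name__ == "__main__":
    # n = 2^16 items and k = 100 defectives, chosen only for illustration.
    for sigma in (1, 2, 4):
        print(sigma, misclassification_bound(2**16, 100, sigma, tau=2))
```

With $n = 2^{16}$, $k = 100$, $\sigma = 1$, and $\tau = 2$, the expression equals $100\,e^{-4} \approx 1.83$; raising $\sigma$ shrinks it exponentially, at the cost of proportionally more tests.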

Abstract

Group testing (GT) is the Boolean version of sparse signal recovery and, due to its simplicity, a marketplace for ideas that can be brought to bear upon related problems, such as heavy hitters, compressed sensing, and multiple access channels. The definition of a "good" GT varies from one buyer to another, but it generally includes (i) usage of nonadaptive tests, (ii) limiting to $O(k \log n)$ tests, (iii) resiliency to test noise, (iv) $O(k\,\mathrm{poly}(\log n))$ decoding time, and (v) lack of mistakes. In this paper, we propose Gacha GT. Gacha is an elementary and self-contained, versatile and unified scheme that, for the first time, satisfies all criteria for a fairly large region of parameters, namely when $\log k < \log(n)^{1-1/O(1)}$. Outside this parameter region, Gacha can be specialized to outperform the state-of-the-art partial-recovery GTs, exact-recovery GTs, and worst-case GTs. The new idea Gacha brings to the market is a redesigned Reed--Solomon code for probabilistic list-decoding at diminishing code rates over reasonably large alphabets. Normally, list-decoding a vanilla Reed--Solomon code is equivalent to the nontrivial task of identifying the subsets of points that fit low-degree polynomials. In this paper, we explicitly tell the decoder which points belong to the same polynomial, thus reducing the complexity and enabling the improvement on GT.
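For readers new to the setting, the measurement model behind criteria (i) and (iii) can be sketched in a few lines. This is a generic illustration, not Gacha's test design: the names `run_noisy_tests` and `bernoulli_design` are hypothetical, the Bernoulli design is a standard textbook choice, and the noise is assumed to be a binary symmetric channel, which is just one instance of the binary-input channels the paper considers.

```python
import random


def bernoulli_design(n, m, k, rng):
    """Fix all m pools up front (nonadaptive): each of n items joins
    each pool independently with probability 1/k. Textbook design,
    used here only as a stand-in."""
    return [{i for i in range(n) if rng.random() < 1.0 / k} for _ in range(m)]


def run_noisy_tests(design, defectives, flip_prob, rng):
    """Return one noisy bit per pool.

    The clean outcome is the Boolean OR over the pool (1 if it contains
    any defective item); it is then flipped with probability flip_prob,
    i.e. an assumed binary symmetric channel.
    """
    outcomes = []
    for pool in design:
        clean = any(item in defectives for item in pool)
        noisy = clean ^ (rng.random() < flip_prob)
        outcomes.append(int(noisy))
    return outcomes


if __name__ == "__main__":
    rng = random.Random(0)
    design = bernoulli_design(n=1000, m=200, k=5, rng=rng)
    defectives = set(rng.sample(range(1000), 5))
    print(run_noisy_tests(design, defectives, flip_prob=0.05, rng=rng)[:20])
```

The decoder's job, which this sketch does not attempt, is to recover `defectives` from the outcome bits and the design alone.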

Paper Structure (38 sections, 17 theorems, 8 equations, 16 figures, 7 tables)

Introduction
We organize our paper as follows.
Problem statement and the marketplace of solutions
Notations and problem statement
The decoding problem
Our new results and implications
A Toy Example of Gacha GT: Assuming
A premature blueprint that does not work (but is inspirational)
An unconventional list-decodability challenge
A code design to meet Section \ref{sec:circle}'s challenge
How to list-decode Section \ref{sec:fold}'s design
Apply the list-decoding idea to GT
How to decode Section \ref{sec:weight}'s GT design
Gadgets that Improve Gacha
Repeat to increase throughput
...and 23 more sections

Key Result

Theorem 1

Let $Z$ be any binary-input channel that models the test noise, and let $\mathcal{O}_Z$ hide a constant that depends only on $Z$. Let $\sigma \geqslant 1$ and $\tau \geqslant 2$ be free integer parameters. Gacha is a randomized GT scheme that uses $m = \mathcal{O}_Z(\sigma k \log(n) 2^{\mathcal{O}(\tau)})$ tests ...

Figures (16)

Figure 1: Gacha GT (this work): reshape the $\nu$-bit phone number into a $\sqrt\nu \times \sqrt\nu$ square array; copy-and-paste each row to a random batch of tests. This combines ideas from Figures \ref{fig:horizontal} and \ref{fig:vertical}.
Figure 2: How Section \ref{sec:circle} synthesizes corrupted words. Each $6$-digit number represents a symbol of $6\sqrt\nu$ bits. Each column is a codeword. Each row in Figure \ref{fig:QR} becomes a symbol here.
Figure 3: Left: folded Reed--Solomon code in the literature. Right: our customized Reed--Solomon code.
Figure 4: Left: a small GT scheme as a building block. Right: a large GT scheme built by disconnected, independent copies of $A$.
Figure 5: Left: Proposition \ref{pro:serial} can be seen as a construction based on a complete bipartite graph. Right: Proposition \ref{pro:expander} can be seen as a construction based on a "good" graph.
...and 11 more figures
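
The reshaping step described in the Figure 1 caption can be illustrated roughly as follows. This is a loose sketch under several assumptions that are not taken from the paper: each batch is assumed to hold exactly $\sqrt\nu$ tests, the rows of one item are assumed to land in distinct batches, and the names `reshape_id` and `item_column` are hypothetical. The paper's actual design carries more information per row than this sketch does.

```python
import math
import random


def reshape_id(bits):
    """Reshape a nu-bit identifier into a sqrt(nu) x sqrt(nu) list of rows."""
    nu = len(bits)
    side = math.isqrt(nu)
    if side * side != nu:
        raise ValueError("nu must be a perfect square")
    return [bits[r * side:(r + 1) * side] for r in range(side)]


def item_column(bits, num_batches, rng):
    """Build one item's column of the test matrix.

    Each row of the reshaped identifier is copied into a randomly chosen
    batch of `side` consecutive tests. Using distinct batches per item is
    a simplifying assumption of this sketch.
    """
    rows = reshape_id(bits)
    side = len(rows)
    batches = rng.sample(range(num_batches), side)
    column = [0] * (num_batches * side)
    for batch, row in zip(batches, rows):
        column[batch * side:(batch + 1) * side] = row
    return column


if __name__ == "__main__":
    ident = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0]  # nu = 16
    print(item_column(ident, num_batches=8, rng=random.Random(1)))
```

Because the rows are scattered, a decoder that sees the batches no longer knows which rows came from the same item; regrouping them is the list-decoding challenge the listed sections address.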

Theorems & Definitions (31)

Theorem 1: Main theorem
Lemma 3: Birthdays are collision-free
proof
Lemma 4: Evaluation pairs are ample
proof
Proposition 5: Decoder is reliable
proof
Theorem 6: Toy Gacha
Proposition 7: Repeat to increase throughput
proof
...and 21 more
