Bounds for list-decoding and list-recovery of random linear codes
Venkatesan Guruswami, Ray Li, Jonathan Mosheiff, Nicolas Resch, Shashwat Silas, Mary Wootters
TL;DR
This work analyzes the output list size $L$ that random linear codes over finite fields need in order to achieve near-capacity list-decoding and list-recovery, including average-radius variants. It establishes a sharp separation between random linear codes and fully random codes for list-recovery: random linear codes require $L \ge \ell^{\Omega(1/\varepsilon)}$, in contrast to $L = O(\ell/\varepsilon)$ for fully random codes. For list-decoding at rate $1-h_q(p)-\varepsilon$, it proves that the list size satisfies $L \ge \lfloor h_q(p)/\varepsilon + 0.99 \rfloor$, which is tight up to a small additive constant in the binary case. It also proves a nearly matching upper bound for binary average-radius list-decoding, $L \le \lfloor h_2(p)/\varepsilon \rfloor + 2$, thereby pinning the binary list sizes to three possible values and illustrating a sharp concentration phenomenon. The methods combine the MRRSW second-moment framework (to derive lower bounds via bad and abundant distributions) with potential-function arguments (to prove upper bounds and concentration), and they reveal a close connection between zero-error list-recovery and erasure-list-decoding in the random-linear-code regime. Overall, the paper advances the understanding of list-size requirements near capacity and highlights fundamental differences between linear and fully random code ensembles.
Abstract
A family of error-correcting codes is list-decodable from error fraction $p$ if, for every code in the family, the number of codewords in any Hamming ball of fractional radius $p$ is less than some integer $L$ that is independent of the code length. It is said to be list-recoverable for input list size $\ell$ if for every sufficiently large subset of codewords (of size $L$ or more), there is a coordinate where the codewords take more than $\ell$ values. The parameter $L$ is said to be the "list size" in either case. The capacity, i.e., the largest possible rate for these notions as the list size $L \to \infty$, is known to be $1-h_q(p)$ for list-decoding, and $1-\log_q \ell$ for list-recovery, where $q$ is the alphabet size of the code family. In this work, we study the list size of random linear codes for both list-decoding and list-recovery as the rate approaches capacity. We show that the following claims hold with high probability over the choice of the code (below, $\varepsilon > 0$ is the gap to capacity). (1) A random linear code of rate $1 - \log_q(\ell) - \varepsilon$ requires list size $L \ge \ell^{\Omega(1/\varepsilon)}$ for list-recovery from input list size $\ell$. This is in surprising contrast to completely random codes, where $L = O(\ell/\varepsilon)$ suffices w.h.p. (2) A random linear code of rate $1 - h_q(p) - \varepsilon$ requires list size $L \ge \lfloor h_q(p)/\varepsilon + 0.99 \rfloor$ for list-decoding from error fraction $p$, when $\varepsilon$ is sufficiently small. (3) A random binary linear code of rate $1 - h_2(p) - \varepsilon$ is list-decodable from average error fraction $p$ with list size $L \le \lfloor h_2(p)/\varepsilon \rfloor + 2$. The second and third results together precisely pin down the list sizes for binary random linear codes for both list-decoding and average-radius list-decoding to three possible values.
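To make the bounds in (2) and (3) concrete, the following is a minimal sketch that evaluates them numerically. It assumes the standard $q$-ary entropy function $h_q(p) = p\log_q(q-1) - p\log_q p - (1-p)\log_q(1-p)$, which the abstract uses but does not spell out; the function names are hypothetical and chosen only for illustration. Note that bound (2) is stated only for sufficiently small $\varepsilon$, so the numbers below illustrate the formulas rather than a verified regime.

```python
import math


def q_ary_entropy(p: float, q: int = 2) -> float:
    """Standard q-ary entropy h_q(p) for 0 <= p <= 1 (assumed definition)."""
    if q < 2:
        raise ValueError("alphabet size q must be at least 2")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 0.0:
        return 0.0
    if p == 1.0:
        return math.log(q - 1) / math.log(q)
    nats = p * math.log(q - 1) - p * math.log(p) - (1 - p) * math.log(1 - p)
    return nats / math.log(q)  # convert from natural log to log base q


def list_decoding_lower_bound(p: float, eps: float, q: int = 2) -> int:
    """Claim (2): floor(h_q(p)/eps + 0.99), stated for sufficiently small eps."""
    return math.floor(q_ary_entropy(p, q) / eps + 0.99)


def binary_avg_radius_upper_bound(p: float, eps: float) -> int:
    """Claim (3): floor(h_2(p)/eps) + 2, binary alphabet only."""
    return math.floor(q_ary_entropy(p, 2) / eps) + 2


if __name__ == "__main__":
    p, eps = 0.25, 0.05  # illustrative values, not taken from the paper
    print("h_2(p)      =", q_ary_entropy(p))
    print("lower bound =", list_decoding_lower_bound(p, eps))
    print("upper bound =", binary_avg_radius_upper_bound(p, eps))
```

Because $\lfloor x + 0.99 \rfloor$ is either $\lfloor x \rfloor$ or $\lfloor x \rfloor + 1$, the lower and upper bounds can differ by at most 2, which is where the "three possible values" in the abstract come from.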
