Bounds for list-decoding and list-recovery of random linear codes

Venkatesan Guruswami, Ray Li, Jonathan Mosheiff, Nicolas Resch, Shashwat Silas, Mary Wootters

TL;DR

This work analyzes the output list size $L$ that random linear codes over finite fields need in order to achieve near-capacity list-decoding and list-recovery, including average-radius variants. It establishes a sharp separation between random linear codes and fully random codes for list-recovery: at rate $1-\log_q \ell-\varepsilon$, random linear codes require $L \ge \ell^{\Omega(1/\varepsilon)}$, in contrast to $L=O(\ell/\varepsilon)$ for fully random codes. For list-decoding at rate $1-h_q(p)-\varepsilon$, it proves that the list size satisfies $L\ge\lfloor h_q(p)/\varepsilon+0.99\rfloor$ for sufficiently small $\varepsilon$, which is tight up to a small additive constant in the binary case. It also proves a matching upper bound for binary average-radius list-decoding, $L\le \lfloor h_2(p)/\varepsilon\rfloor+2$, thereby pinning the binary list size to three possible values and illustrating a sharp concentration phenomenon. The methods combine the MRRSW second-moment framework (to derive lower bounds via bad and abundant distributions) with potential-function arguments (to prove the upper bound and the concentration), and they reveal a close connection between zero-error list-recovery and erasure-list-decoding in the random-linear-code regime. Overall, the paper advances the understanding of list-size requirements near capacity and highlights fundamental differences between linear and fully random code ensembles.
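
To make the "three possible values" statement concrete, the following sketch evaluates the two binary bounds quoted above for a given $p$ and $\varepsilon$. It is only an illustration: the function names are hypothetical, the standard formula for the $q$-ary entropy $h_q$ is assumed, and the lower bound is stated in the source only for sufficiently small $\varepsilon$, a condition this code does not check.

```python
import math


def q_ary_entropy(p: float, q: int = 2) -> float:
    """Standard q-ary entropy h_q(p), assumed here for 0 <= p <= 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 0.0:
        return 0.0
    if p == 1.0:
        return math.log(q - 1, q)
    return (
        p * math.log(q - 1, q)
        - p * math.log(p, q)
        - (1 - p) * math.log(1 - p, q)
    )


def list_size_lower_bound(p: float, eps: float, q: int = 2) -> int:
    """floor(h_q(p)/eps + 0.99): the list-decoding lower bound quoted above."""
    return math.floor(q_ary_entropy(p, q) / eps + 0.99)


def binary_avg_radius_upper_bound(p: float, eps: float) -> int:
    """floor(h_2(p)/eps) + 2: the binary average-radius upper bound quoted above."""
    return math.floor(q_ary_entropy(p, 2) / eps) + 2


if __name__ == "__main__":
    p, eps = 0.25, 0.1  # illustrative values only, not taken from the paper
    lo = list_size_lower_bound(p, eps)
    hi = binary_avg_radius_upper_bound(p, eps)
    print(f"h_2({p}) = {q_ary_entropy(p):.4f}, list size L in [{lo}, {hi}]")
```

Since $\lfloor x+0.99\rfloor \ge \lfloor x\rfloor$ for every real $x$, the interval between the two bounds contains at most the three integers $\lfloor x\rfloor$, $\lfloor x\rfloor+1$, $\lfloor x\rfloor+2$ with $x = h_2(p)/\varepsilon$.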

Abstract

A family of error-correcting codes is list-decodable from error fraction $p$ if, for every code in the family, the number of codewords in any Hamming ball of fractional radius $p$ is less than some integer $L$ that is independent of the code length. It is said to be list-recoverable for input list size $\ell$ if for every sufficiently large subset of codewords (of size $L$ or more), there is a coordinate where the codewords take more than $\ell$ values. The parameter $L$ is said to be the "list size" in either case. The capacity, i.e., the largest possible rate for these notions as the list size $L \to \infty$, is known to be $1-h_q(p)$ for list-decoding, and $1-\log_q \ell$ for list-recovery, where $q$ is the alphabet size of the code family. In this work, we study the list size of random linear codes for both list-decoding and list-recovery as the rate approaches capacity. We show the following claims hold with high probability over the choice of the code (below, $\varepsilon > 0$ is the gap to capacity). (1) A random linear code of rate $1 - \log_q(\ell) - \varepsilon$ requires list size $L \ge \ell^{\Omega(1/\varepsilon)}$ for list-recovery from input list size $\ell$. This is surprisingly in contrast to completely random codes, where $L = O(\ell/\varepsilon)$ suffices w.h.p. (2) A random linear code of rate $1 - h_q(p) - \varepsilon$ requires list size $L \ge \lfloor h_q(p)/\varepsilon+0.99 \rfloor$ for list-decoding from error fraction $p$, when $\varepsilon$ is sufficiently small. (3) A random binary linear code of rate $1 - h_2(p) - \varepsilon$ is list-decodable from average error fraction $p$ with list size $L \leq \lfloor h_2(p)/\varepsilon\rfloor + 2$. The second and third results together precisely pin down the list sizes for binary random linear codes for both list-decoding and average-radius list-decoding to three possible values.
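
The list-decodability definition above can be checked by brute force on a toy example. The sketch below samples a small random binary linear code and finds the largest number of codewords in any Hamming ball of a given radius. It is an assumption-laden illustration, not the paper's method: the helper names are hypothetical, vectors are encoded as Python integers, and the tiny block length says nothing about the asymptotic bounds.

```python
import random
from itertools import product


def random_linear_code(n: int, k: int, rng: random.Random) -> set[int]:
    """Span of k uniformly random rows in F_2^n, with vectors stored as n-bit ints.

    The random rows need not be independent, so the code may have fewer
    than 2**k codewords; using a set keeps each codeword once.
    """
    rows = [rng.getrandbits(n) for _ in range(k)]
    codewords = set()
    for coeffs in product((0, 1), repeat=k):
        word = 0
        for coeff, row in zip(coeffs, rows):
            if coeff:
                word ^= row  # addition over F_2 is XOR
        codewords.add(word)
    return codewords


def max_ball_occupancy(codewords: set[int], n: int, radius: int) -> int:
    """Largest number of codewords within Hamming distance `radius` of any center."""
    best = 0
    for center in range(1 << n):  # every vector in F_2^n is a candidate center
        count = sum(
            1 for c in codewords if bin(center ^ c).count("1") <= radius
        )
        best = max(best, count)
    return best


if __name__ == "__main__":
    n, k, p = 10, 4, 0.2  # toy parameters chosen only to keep the search small
    code = random_linear_code(n, k, random.Random(0))
    occupancy = max_ball_occupancy(code, n, int(p * n))
    # Under the definition above, any integer L > occupancy is a valid list size.
    print(f"{len(code)} codewords, max ball occupancy {occupancy}")
```

The exhaustive search costs about $2^n \cdot 2^k$ distance computations, so it is only feasible for very short codes.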

Paper Structure

This paper contains 28 sections, 15 theorems, 94 equations.

Key Result

Theorem 2.5

Let $R \in (0,1)$ and fix $\eta > 0$. Let $\tau$ be a $(1 - R - \eta)$-implicitly rare distribution over ${\mathbb F}_q^L$ ($L\in \mathbb{N}$), and let $\mathcal{C}$ be a random linear code of rate $R$. Then […]. Conversely, suppose that $\tau$ is not $(1 - R + \eta)$-implicitly rare. Then […].

Theorems & Definitions (37)

  • Definition 2.1: List-recovery from erasures
  • Definition 2.2: $\tau_M$, $\dim(\tau)$, $\mathcal{M}_{n,\tau}$
  • Remark 2.3
  • Definition 2.4: $\gamma$-implicitly rare
  • Theorem 2.5: Follows from Lemma 2.7 in MRRSW
  • Theorem 3.1
  • Definition 3.2: The bad distribution $\tau$ for list-recovery lower bounds
  • Proposition 3.3: $\tau$ is bad
  • proof
  • Lemma 3.4: $\tau$ is abundant
  • ...and 27 more