Random Reed-Solomon Codes Achieve the Half-Singleton Bound for Insertions and Deletions over Linear-Sized Alphabets

Roni Con, Zeyu Guo, Ray Li, Zihan Zhang

TL;DR

The paper proves that, with high probability, random Reed-Solomon codes of length n and dimension k over alphabets of size linear in n can correct (1−ε)n−2k+1 adversarial insdel errors, approaching the half-Singleton bound for linear insertion-deletion codes. The proof blends random-matrix techniques from list decoding with structural properties of longest common subsequences, introducing V-matrices and chain-based decompositions to bound the rank of these matrices under partial assignments. The argument proceeds in two regimes: quadratic-sized alphabets via probabilistic certificates, and linear-sized alphabets via chain decomposition and banked certificates, reaching alphabet size q = n + 2^{poly(1/ε)}k, which is n + O(k) for constant ε. This substantially reduces the alphabet size compared with the prior existential result of Con, Shpilka, and Tamo, and motivates further work on explicit constructions and decoding algorithms for insdel-correcting RS codes.

Abstract

In this paper, we prove that with high probability, random Reed-Solomon codes approach the half-Singleton bound, the optimal rate versus error tradeoff for linear insdel codes, with linear-sized alphabets. More precisely, we prove that, for any $\varepsilon>0$ and positive integers $n$ and $k$, with high probability, random Reed-Solomon codes of length $n$ and dimension $k$ can correct $(1-\varepsilon)n-2k+1$ adversarial insdel errors over alphabets of size $n+2^{\mathsf{poly}(1/\varepsilon)}k$. This significantly improves upon the alphabet size demonstrated in the work of Con, Shpilka, and Tamo (IEEE TIT, 2023), who showed the existence of Reed-Solomon codes with exponential alphabet size $\widetilde O\left(\binom{n}{2k-1}^2\right)$ precisely achieving the half-Singleton bound. Our methods are inspired by recent works on list-decoding Reed-Solomon codes. Brakensiek-Gopi-Makam (STOC 2023) showed that random Reed-Solomon codes are list-decodable up to capacity with exponential-sized alphabets, and Guo-Zhang (FOCS 2023) and Alrabiah-Guruswami-Li (STOC 2024) improved the alphabet size to linear. We achieve a similar alphabet-size reduction by establishing strong bounds on the probability that certain random rectangular matrices are full rank. To accomplish this in our insdel context, our proof combines the random matrix techniques from list-decoding with structural properties of Longest Common Subsequences.
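
The role of Longest Common Subsequences (LCS) can be made concrete with a brute-force check. A standard fact about insdel codes is that a code of length $n$ corrects $t$ insdel errors exactly when every pair of distinct codewords has an LCS of length at most $n-t-1$. The sketch below is an illustration only and is not the paper's method: the function names are hypothetical, it assumes a prime field size `q` so that arithmetic is plain modular arithmetic, and it enumerates all codewords, so it is only feasible for tiny parameters.

```python
from itertools import product


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rs_codewords(points, k, q):
    """All codewords (f(a_1), ..., f(a_n)) for polynomials f of degree < k
    over the prime field F_q. Assumes q is prime (hypothetical helper)."""
    words = []
    for coeffs in product(range(q), repeat=k):
        word = tuple(
            sum(c * pow(a, i, q) for i, c in enumerate(coeffs)) % q
            for a in points
        )
        words.append(word)
    return words


def max_correctable_insdel(words):
    """Largest t such that every pair of distinct codewords has LCS <= n - t - 1."""
    n = len(words[0])
    worst = max(
        lcs_length(u, v)
        for i, u in enumerate(words)
        for v in words[i + 1:]
    )
    return n - worst - 1


# k = 1: codewords are constant words, so distinct codewords share nothing.
# Here n - 2k + 1 = 2, and the code does correct 2 insdel errors.
print(max_correctable_insdel(rs_codewords((1, 2, 3), k=1, q=5)))  # 2

# k = 2 with consecutive evaluation points: f(x) = x and f(x) = x + 1 give
# (1, 2, 3, 4) and (2, 3, 4, 5), which share the subsequence (2, 3, 4).
print(max_correctable_insdel(rs_codewords((1, 2, 3, 4), k=2, q=7)))  # 0
```

The second call shows why the choice of evaluation points matters: with $n=4$ and $k=2$ the target is $n-2k+1=1$ correctable error, but consecutive points yield a pair of codewords with LCS length 3, so no errors can be corrected.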

Paper Structure

This paper contains 21 sections, 29 theorems, 25 equations, 4 figures, 2 algorithms.

Key Result

Theorem 1

Every linear insdel code which is capable of correcting a $\delta$ fraction of deletions has rate at most $(1-\delta)/2 + o(1)$.
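
To see how Theorem 1 relates to the error count in the abstract, the bound can be rearranged as follows. This is a sketch of our own rather than a statement from the paper: it assumes rate $R = k/n$ and deletion fraction $\delta = t/n$ for $t$ correctable deletions, and it drops the $o(1)$ term.

$$
R \le \frac{1-\delta}{2}
\;\Longleftrightarrow\;
\delta \le 1 - 2R
\;\Longleftrightarrow\;
t \le n - 2k .
$$

Under these assumptions, the guarantee of $(1-\varepsilon)n - 2k + 1 = (n - 2k + 1) - \varepsilon n$ errors stated in the abstract sits within $\varepsilon n$ of this limit, up to lower-order terms, which is the sense in which the codes approach the bound.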

Figures (4)

  • Figure 1: Example of chains
  • Figure 2: A decomposition of $(I,J)$, where $I=(3,4,6,7,8,9)$ and $J=(1,2,3,4,6,8)$, into maximal chains $(I_1,J_1)$ and $(I_2,J_2)$. Note that the maximal chains are $(I,J)$-disjoint but can interleave.
  • Figure 3: Illustration of Lemma \ref{lem:split}: splitting long chains into short ones. Ensure that all chains have length at most $1/\varepsilon$ by removing at most an $\varepsilon$ fraction of pairs from each chain.
  • Figure 4: Illustration of Lemma \ref{lem:nonsingular}: re-indeterminating a faulty (blue) chain with (part of) a (gray) chain from the bank. Three cases: (1) The faulty chain and the bank's chain are both type II with the same orientation. The new $V$-matrix is equivalent to the original one. (2) The faulty chain and the bank's (gray) chain are both type II with different orientations. The new $V$-matrix is equivalent to the original one when we view the bank's gray chain in reverse. (3) The faulty chain is type I and the bank's chain is type II. The new $V$-matrix is not exactly equivalent to the original one, but is similar enough: since the old matrix is full rank (before setting $X_i$), the new matrix is full rank as well, which is what we need.

Theorems & Definitions (67)

  • Theorem 1: Half-Singleton bound [cheng2020efficient]
  • Remark
  • Definition 2: Reed-Solomon code
  • Theorem 3: [con2023reed]
  • Theorem 4: Informal, details in Theorem \ref{main}
  • Corollary: Corollary \ref{cor:fullrankprob}, informal
  • Definition 5: Partial assignment
  • Lemma 6
  • proof
  • Definition 7
  • ...and 57 more