Optimal Erasure Codes and Codes on Graphs

Yeyuan Chen, Mahdi Cheraghchi, Nikhil Shagrithaya

TL;DR

This work presents constant-sized ensembles of linear codes over fixed alphabets that can correct a fraction $\delta$ of adversarial erasures at rates approaching the Singleton bound, with encoding and erasure decoding in quasi-linear time. It builds a unified framework tying erasure-code families to symbol-fixing extractors and condensers, yielding strongly explicit constructions for: (i) optimal linear codes on bipartite graphs, (ii) nearly-MDS codes over constant alphabets, and (iii) codes on non-bipartite graphs with improved rates. The key techniques combine code concatenation with randomness-efficient permutation of coordinates via seeds from extractors, enabling quasi-linear complexity and strong explicitness. The results advance explicit code constructions that approach capacity over fixed alphabets, with practical implications for distributed storage and fault-tolerant systems, while highlighting open questions about optimal rates for non-bipartite graph codes and seed-length optimizations for linear extractors.

Abstract

We construct constant-sized ensembles of linear error-correcting codes over any fixed alphabet that can correct a given fraction of adversarial erasures at rates approaching the Singleton bound arbitrarily closely. We provide several applications of our results: 1. Explicit constructions of strong linear seeded symbol-fixing extractors and lossless condensers, over any fixed alphabet, with only a constant seed length and optimal output lengths; 2. A strongly explicit construction of erasure codes on bipartite graphs (more generally, linear codes on matrices of arbitrary dimensions) with optimal rate and erasure-correction trade-offs; 3. A strongly explicit construction of erasure codes on non-bipartite graphs (more generally, linear codes on symmetric square matrices) achieving improved rates; 4. A strongly explicit construction of linear nearly-MDS codes over constant-sized alphabets that can be encoded and decoded in quasi-linear time.
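
For context, the Singleton bound referred to above can be restated for erasures as follows. This is the standard textbook argument, written here with symbols $N$, $k$, $d$, $R$ that are chosen for illustration and may differ from the paper's notation.

```latex
% An [N, k, d]_q code satisfies the Singleton bound:
\[
  k \le N - d + 1 .
\]
% Correcting every pattern of \delta N erasures requires minimum distance
% d \ge \delta N + 1, so
\[
  k \le N - \delta N
  \quad\Longrightarrow\quad
  R = \frac{k}{N} \le 1 - \delta .
\]
% "Approaching the Singleton bound" therefore means rate R \ge 1 - \delta - \eta
% for arbitrarily small \eta > 0.
```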

Paper Structure

This paper contains 28 sections, 31 theorems, 14 equations, 2 figures.

Key Result

Theorem 1

For any $\delta \in [0,1)$, $\eta>0$, prime power $q$, and large enough $N$, there is a strongly explicit construction of an ensemble of linear codes of length $N$ over $\mathds{F}_q$ of rate at least $1-\delta-\eta$ such that any pattern of up to $\delta N$ erasures can be corrected by all but up t
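
To make the rate versus erasure trade-off concrete, here is a minimal sketch of erasure decoding for a linear code by solving a linear system over a prime field. It is not the paper's construction: it uses a toy Reed-Solomon-style code over $\mathbb{F}_7$ (whose alphabet must grow with the length, unlike the fixed-alphabet ensembles in the theorem), and the names `encode`, `solve_mod_p`, and `decode_erasures` are hypothetical.

```python
# Toy illustration only: a Reed-Solomon-style code over F_7,
# NOT the ensemble constructed in the paper.
P = 7          # field size (a prime, assumed for simplicity)
N, K = 6, 3    # length and dimension: rate 1/2, tolerates 3 = N/2 erasures

def encode(msg, n=N, p=P):
    """Evaluate the message polynomial at the points 0..n-1 of F_p."""
    return [sum(m * pow(x, i, p) for i, m in enumerate(msg)) % p
            for x in range(n)]

def solve_mod_p(A, b, p=P):
    """Solve A x = b over F_p by Gaussian elimination.

    Returns the unique solution, or None if A lacks full column rank.
    Assumes b is consistent (it comes from a valid codeword).
    """
    rows, cols = len(A), len(A[0])
    M = [list(A[r]) + [b[r]] for r in range(rows)]
    pivot_row = 0
    for col in range(cols):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, rows) if M[r][col] % p), None)
        if pr is None:
            return None
        M[pivot_row], M[pr] = M[pr], M[pivot_row]
        inv = pow(M[pivot_row][col], -1, p)  # modular inverse (Python 3.8+)
        M[pivot_row] = [(v * inv) % p for v in M[pivot_row]]
        for r in range(rows):
            if r != pivot_row and M[r][col] % p:
                f = M[r][col]
                M[r] = [(vr - f * vp) % p
                        for vr, vp in zip(M[r], M[pivot_row])]
        pivot_row += 1
    return [M[i][cols] for i in range(cols)]

def decode_erasures(received, k=K, p=P):
    """Recover the message; erased positions are marked with None."""
    known = [(x, y) for x, y in enumerate(received) if y is not None]
    if len(known) < k:
        return None  # too many erasures for a dimension-k code
    A = [[pow(x, i, p) for i in range(k)] for x, _ in known]
    b = [y for _, y in known]
    return solve_mod_p(A, b, p)
```

With $N = 6$ and $k = 3$, any 3 erasures leave 3 evaluation points, which determine a polynomial of degree below 3, so this toy code meets the Singleton bound exactly at $\delta = 1/2$. A fourth erasure leaves too few equations and the decoder returns `None`.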

Figures (2)

  • Figure 1: Construction of the erasure code family in \ref{sec:codes} from the decoder's perspective (codeword at the top, decoding at the bottom). The function ${\mathsf{Ext}}\colon [N] \times [D] \to [M]$ is a strong $(\log N - \Delta, \nu)$-extractor for $\Delta=-\log(1-\delta)$ and $\nu = O(\epsilon \eta^2)$. The inner code family $\mathfrak{C}_{\mathsf{in}}$ is an $[L, \delta+2\eta, \mu]_q$-erasure code family for $\mu = O(\epsilon \eta)$. The construction contains a code for each choice of $(z,\mathcal{C}_{\mathsf{in}}) \in [D] \times \mathfrak{C}_{\mathsf{in}}$. The extractor assigns codeword positions to outer code blocks, in order. Occasionally, this causes overfull blocks, in which case the corresponding codeword position is frozen to zero (as depicted).
  • Figure 2: Construction of the bipartite graph codes in \ref{sec:graphs}. The row-erasure correction code $\mathcal{C}_{\mathsf{row}}$ (of alphabet size $q^{\ell_0}$) is bundled to provide a sufficient number $\ell$ of columns. Then, each row of the matrix consisting of codewords of $\mathcal{C}_{\mathsf{row}}$ is encoded by a codeword from the erasure code family $\mathfrak{C}$ to provide column-erasure correction.
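
The position-assignment step described in the Figure 1 caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's algorithm: `toy_ext` is a hash-based stand-in with no extractor guarantees, the per-block capacity `L` is assumed to equal the inner code length, and the function names are hypothetical.

```python
import hashlib

def toy_ext(i, z, M):
    """Stand-in for Ext(i, z): maps position i and seed z to a block in range(M).

    ASSUMPTION: a hash is used purely for illustration; it is NOT a strong
    extractor and carries none of the guarantees the construction needs.
    """
    h = hashlib.sha256(f"{i},{z}".encode()).digest()
    return int.from_bytes(h[:8], "big") % M

def assign_positions(N, z, M, L):
    """Assign codeword positions 0..N-1 to M blocks, in order.

    A position that lands in a block already holding L positions is
    recorded as frozen (its symbol would be fixed to zero).
    """
    blocks = [[] for _ in range(M)]
    frozen = []
    for i in range(N):
        b = toy_ext(i, z, M)
        if len(blocks[b]) < L:   # ASSUMPTION: capacity L per block
            blocks[b].append(i)
        else:
            frozen.append(i)     # overfull block: freeze this position
    return blocks, frozen
```

For example, `assign_positions(N=64, z=3, M=8, L=8)` returns the block contents and the list of frozen positions for seed `z = 3`. Each seed `z` yields a different assignment, which matches the caption's statement that the construction contains one code per pair $(z, \mathcal{C}_{\mathsf{in}})$.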

Theorems & Definitions (52)

  • Theorem 1: \ref{coro:explicit}, simplified
  • Corollary 2: \ref{coro:explicit:bitfixing}, simplified
  • Corollary 3: \ref{coro:matrix:explicit}, simplified
  • Theorem 4: \ref{thm:almost:MDS:improve}, simplified
  • Theorem 5: \ref{thm:graph:code:explicit}, simplified
  • Definition 6
  • Theorem 7
  • Proposition 8
  • Definition 9
  • Definition 10
  • ...and 42 more