Optimal Erasure Codes and Codes on Graphs
Yeyuan Chen, Mahdi Cheraghchi, Nikhil Shagrithaya
TL;DR
This work presents constant-sized ensembles of linear codes over fixed alphabets that can correct a fraction $\delta$ of adversarial erasures at rates approaching the Singleton bound, with encoding and erasure decoding in quasi-linear time. It builds a unified framework tying erasure-code families to symbol-fixing extractors and condensers, yielding strongly explicit constructions for: (i) optimal linear codes on bipartite graphs, (ii) nearly-MDS codes over constant alphabets, and (iii) codes on non-bipartite graphs with improved rates. The key techniques combine code concatenation with randomness-efficient permutation of coordinates via seeds from extractors, enabling quasi-linear complexity and strong explicitness. The results advance explicit code constructions that approach capacity over fixed alphabets, with practical implications for distributed storage and fault-tolerant systems, while highlighting open questions about optimal rates for non-bipartite graph codes and seed-length optimizations for linear extractors.
Abstract
We construct constant-sized ensembles of linear error-correcting codes over any fixed alphabet that can correct a given fraction of adversarial erasures at rates approaching the Singleton bound arbitrarily closely. We provide several applications of our results:
1. Explicit constructions of strong linear seeded symbol-fixing extractors and lossless condensers, over any fixed alphabet, with only a constant seed length and optimal output lengths;
2. A strongly explicit construction of erasure codes on bipartite graphs (more generally, linear codes on matrices of arbitrary dimensions) with optimal rate and erasure-correction trade-offs;
3. A strongly explicit construction of erasure codes on non-bipartite graphs (more generally, linear codes on symmetric square matrices) achieving improved rates;
4. A strongly explicit construction of linear nearly-MDS codes over constant-sized alphabets that can be encoded and decoded in quasi-linear time.
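To make the notion of erasure correction at the Singleton bound concrete, the following is a minimal illustrative sketch and not the construction of this paper. It assumes a textbook Reed-Solomon code over the prime field GF(7), whose alphabet must be at least as large as the block length, whereas the results above concern fixed alphabets. All function names (`rs_generator`, `encode`, `decode_erasures`) are hypothetical. For a linear code, erasure decoding reduces to solving a linear system restricted to the surviving coordinates.

```python
def rs_generator(n, k, p):
    """k x n Vandermonde generator matrix over GF(p); assumes p prime and n <= p."""
    return [[pow(x, i, p) for x in range(n)] for i in range(k)]


def encode(msg, G, p):
    """Codeword = msg * G over GF(p)."""
    n = len(G[0])
    return [sum(m * row[j] for m, row in zip(msg, G)) % p for j in range(n)]


def decode_erasures(received, G, p):
    """Recover the message from a word whose erased symbols are None.

    Solves msg * G_S = c_S by Gauss-Jordan elimination mod p, where S is
    the set of surviving coordinates. Returns None when the surviving
    columns do not determine the message uniquely.
    """
    k = len(G)
    surviving = [j for j, s in enumerate(received) if s is not None]
    # One equation per surviving coordinate; the unknowns are the k message symbols.
    rows = [[G[i][j] for i in range(k)] + [received[j]] for j in surviving]
    pivot_row = 0
    for col in range(k):
        piv = next((r for r in range(pivot_row, len(rows)) if rows[r][col] % p), None)
        if piv is None:
            return None  # rank deficient: too many erasures for this pattern
        rows[pivot_row], rows[piv] = rows[piv], rows[pivot_row]
        inv = pow(rows[pivot_row][col], -1, p)  # modular inverse, Python 3.8+
        rows[pivot_row] = [v * inv % p for v in rows[pivot_row]]
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % p for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return [rows[i][k] for i in range(k)]


# Illustrative parameters (assumptions): n = 6, k = 3 over GF(7).
p, n, k = 7, 6, 3
G = rs_generator(n, k, p)
codeword = encode([2, 5, 1], G, p)

# Erase n - k = 3 of the 6 symbols; any 3 survivors suffice for an MDS code.
received = [None, codeword[1], None, codeword[3], codeword[4], None]
print(decode_erasures(received, G, p))  # -> [2, 5, 1]
```

In this toy instance the rate is $k/n = 1/2$ and the code tolerates any erasure pattern of fraction $(n-k)/n = 1/2$, so rate plus erasure fraction equals 1, which is what the Singleton bound permits at best. Erasing a fourth symbol leaves fewer equations than unknowns, and the decoder returns `None`.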
