Repairing Reed-Solomon Codes with Side Information
Thi Xinh Dinh, Ba Thong Le, Son Hoang Dau, Serdar Boztas, Stanislav Kruglik, Han Mao Kiah, Emanuele Viterbo, Tuvi Etzion, Yeow Meng Chee
TL;DR
The repair of a single erased Reed-Solomon symbol is generalized to exploit side information, modeled as an $\mathbb{F}_q$-subspace $\mathcal{S}$ of dimension $s$. The authors develop a trace-based repair framework and prove that the minimum repair bandwidth depends only on $s$, not on the specific content of $\mathcal{S}$. They derive a general lower bound and construct optimal subspace-polynomial schemes in several parameter regimes. In the full-length setting with $n=q^{\ell}$ and $n-k=q^{m}$, the lower bound specializes to the explicit expression $(q^{\ell}-1)(\ell-s)-\dfrac{(q^{\ell-s}-1)(q^{m}-1)}{q-1}$, and suitable choices of subspaces yield optimal schemes. The work reduces the repair problem with side information to a subspace-intersection optimization, linking the achievable bandwidth to geometric properties of subspace intersections. These results enable lower-cost repairs in distributed storage when side information is available.
Abstract
We generalize the problem of recovering a lost/erased symbol in a Reed-Solomon code to the scenario in which some side information about the lost symbol is known. The side information is represented as a set $S$ of linearly independent combinations of the sub-symbols of the lost symbol. When $S = \varnothing$, this reduces to the standard problem of repairing a single codeword symbol. When $S$ is a set of sub-symbols of the erased symbol, this becomes the problem of repairing a partially lost/erased symbol. We first establish that the minimum repair bandwidth depends only on $|S|$ and not on the content of $S$, and we derive a lower bound on the repair bandwidth of a linear repair scheme with side information $S$. We then consider the well-known subspace-polynomial repair schemes and show that their repair bandwidths can be optimized by choosing the right subspaces. Finally, we demonstrate several parameter regimes in which the optimal bandwidths can be achieved for full-length Reed-Solomon codes.
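To make the side-information model concrete, the following is a minimal sketch, not the paper's repair scheme. It assumes $q = 2$, a lost symbol with $\ell = 4$ binary sub-symbols stored as a bitmask, and linear combinations represented as bitmasks of the sub-symbols they sum. All names (`evaluate`, `rank_gf2`, `consistent_symbols`) and the example values are hypothetical. The sketch only illustrates that $|S| = s$ independent combinations leave $2^{\ell - s}$ candidate symbols, so $\ell - s$ further independent combinations pin down the lost symbol.

```python
def evaluate(functional, symbol):
    """Value (0 or 1) of a GF(2)-linear combination of the symbol's sub-symbols."""
    return bin(functional & symbol).count("1") & 1

def rank_gf2(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks (Gaussian elimination)."""
    pivots = {}  # leading-bit position -> reduced vector with that leading bit
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]
    return len(pivots)

def consistent_symbols(functionals, values, ell):
    """All ell-bit symbols that agree with the known combination values (brute force)."""
    return [c for c in range(1 << ell)
            if all(evaluate(f, c) == v for f, v in zip(functionals, values))]

ELL = 4                      # assumed number of sub-symbols per symbol
lost = 0b1011                # hypothetical erased symbol
side = [0b0011, 0b0100]      # side information S: s = 2 independent combinations
extra = [0b1000, 0b0001]     # ELL - s further combinations, standing in for repair data

side_values = [evaluate(f, lost) for f in side]
extra_values = [evaluate(f, lost) for f in extra]

after_side = consistent_symbols(side, side_values, ELL)
after_repair = consistent_symbols(side + extra, side_values + extra_values, ELL)

print(rank_gf2(side), len(after_side))   # s and the number of remaining candidates
print(rank_gf2(side + extra), after_repair == [lost])
```

With $S = \varnothing$ the candidate list is all $2^{\ell}$ symbols, which corresponds to the standard single-symbol repair problem. How the helper nodes supply the remaining combinations at low bandwidth is the subject of the paper and is not modeled here.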
