Trace Repair Never Loses to Classical Repair: Exact and Explicit Helper Nodes Selection
Wilton Kim, Stanislav Kruglik, Han Mao Kiah
TL;DR
This paper addresses bandwidth-efficient single-erasure repair of Reed-Solomon codes in distributed storage by unifying, within the Guruswami-Wootters trace-repair framework, Lin's zero-forcing of a predetermined node set $\mathscr{S}$ with the linear dependencies among downloaded traces identified by Liu, Wan, and Xing. It introduces Trace-Repair Compatible polynomials and establishes an exact dimension relationship with the subspace $\mathscr{W}_{k,\mathscr{S}}$, enabling an explicit optimization (Algorithm O) over the exclusion set $\mathscr{S}$ and cyclotomic cosets. The authors provide an explicit construction of the helper nodes and prove a universal bandwidth bound of at most $k\log|\mathbb{F}|$ bits whenever trace repair is applicable, so the optimized scheme is never worse than classical repair for any $k\le n-q^{t-1}$. Their contributions include exact dimension formulas, an optimization framework, and practical procedures that outperform state-of-the-art schemes across several finite-field regimes. Overall, the work delivers both theoretical guarantees and implementable repair protocols for Reed-Solomon codes in distributed storage.
Abstract
Repairing Reed-Solomon codes with low bandwidth is a central challenge in distributed storage. Following the trace-repair framework of Guruswami and Wootters (2017), recent works by Lin (2023) and Liu-Wan-Xing (2024) provided significant improvements in bandwidth using two distinct ideas. Lin constructed a trace-repair scheme that requires no contribution from a predetermined set of nodes $\mathscr{S}$, while Liu-Wan-Xing identified linear dependencies among the downloaded traces, relating the number of dependent traces to the dimension of a subspace $\mathscr{W}_k$. In this work, we fully utilize and unify these ideas. We compute the exact dimension of $\mathscr{W}_{k,\mathscr{S}}$ (a generalization of $\mathscr{W}_k$), identify the trade-off between the set size $|\mathscr{S}|$ and the dimension $\dim(\mathscr{W}_{k,\mathscr{S}})$, and provide an algorithm to find the combination that results in the lowest bandwidth. Furthermore, we provide an explicit choice of the helper nodes for the repair. Finally, we prove that our optimized scheme never loses to the classical repair scheme, establishing a bandwidth guarantee of at most $k\log|\mathbb{F}|$ bits for every dimension $k$ and field $\mathbb{F}$ for which trace repair is applicable.
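To make the "never loses" claim concrete, the following minimal sketch compares the two baselines the abstract refers to. It assumes the standard setting of a Reed-Solomon code of length $n$ and dimension $k$ over $\mathbb{F}$ with $|\mathbb{F}| = q^t$, classical repair that downloads $k$ full symbols, and the original Guruswami-Wootters scheme in which each of the $n-1$ helpers sends one base-field symbol. The function names are illustrative and do not come from the paper, and the optimized scheme itself (exclusion set and dependent traces) is not modelled here.

```python
import math

def classical_bits(k, q, t):
    """Classical repair: download k full symbols of F = GF(q^t)."""
    return k * t * math.log2(q)

def gw_bits(n, q):
    """Baseline Guruswami-Wootters repair (assumed setting):
    each of the n - 1 helper nodes sends one symbol of GF(q)."""
    return (n - 1) * math.log2(q)

def trace_repair_applicable(n, k, q, t):
    """Applicability condition k <= n - q^(t-1)."""
    return k <= n - q ** (t - 1)

if __name__ == "__main__":
    q, t, n = 2, 8, 256  # full-length code over GF(2^8)
    for k in (128, 16):
        print(k,
              trace_repair_applicable(n, k, q, t),
              classical_bits(k, q, t),
              gw_bits(n, q))
```

With these parameters, the baseline trace repair needs 255 bits against 1024 bits for classical repair at $k = 128$, but at $k = 16$ classical repair needs only 128 bits and the baseline trace repair is worse. This low-dimension regime is where a guarantee of at most $k\log|\mathbb{F}|$ bits matters.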
