Rack-Aware MSR Codes with Linear Field Size and Smaller Sub-Packetization for Tolerating Multiple Erasures
Hengming Zhao, Dianhua Wu, Minquan Cheng
TL;DR
This work addresses efficient repair in rack-aware distributed storage by constructing two explicit rack-aware MSR code families that tolerate multiple erasures within a host rack. Leveraging a coupled-layer alignment technique and kernel-map-based parity design, the authors achieve small sub-packetization levels with linear field size, while preserving optimal or near-optimal repair bandwidth for a range of failed-node counts h. The first construction attains $l=\bar{s}^{\lceil\bar{n}/\bar{s}\rceil}$ with optimal bandwidth for 1≤h≤u−v and asymptotic optimality for higher h, plus optimal access at h=u−v; the second construction further reduces sub-packetization to $l=\bar{s}^{\lceil\bar{n}/(\bar{s}+1)\rceil}$ while maintaining similar bandwidth guarantees. The results advance practical rack-aware MSR coding by enabling efficient multi-node repair with modest field sizes, addressing both bandwidth and access constraints in hierarchical data-center architectures.
Abstract
In an $(n,k,d)$ rack-aware storage model, the system consists of $n$ nodes uniformly distributed across $\bar{n}$ successive racks, such that each rack contains $u$ nodes of equal capacity and the reconstructive degree satisfies $k=\bar{k}u+v$ where $0\leq v\leq u-1$. Suppose there are $h\geq1$ failed nodes in a rack (called the host rack). Then, together with its surviving nodes, the host rack downloads recovery data from $\bar{d}$ helper racks and repairs its failed nodes. In this paper, we focus on rack-aware minimum storage regenerating (MSR) codes for repairing $h$ failed nodes within the same rack. By using the coupled-layer construction with the alignment technique, we construct the first class of rack-aware MSR codes for all $\bar{k}+1\leq\bar{d}\leq\bar{n}-1$ that achieve the small sub-packetization $l=\bar{s}^{\lceil\bar{n}/\bar{s}\rceil}$, where $\bar{s}=\bar{d}-\bar{k}+1$ and the field size $q$ increases linearly with $n$. In addition, these codes achieve optimal repair bandwidth for $1\leq h\leq u-v$, and asymptotically optimal repair bandwidth for $u-v+1\leq h\leq u$. In particular, they achieve optimal access when $h=u-v$. It is worth noting that the existing rack-aware MSR codes that achieve the same sub-packetization $l=\bar{s}^{\lceil\bar{n}/\bar{s}\rceil}$ are known only for the special case $\bar{d}=\bar{n}-1$, $h=1$, and require a field size much larger than ours. Then, based on our first construction, we further develop another class of explicit rack-aware MSR codes with even smaller sub-packetization $l=\bar{s}^{\lceil\bar{n}/(\bar{s}+1)\rceil}$ for all admissible values of $\bar{d}$.
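
To make the parameter relationships above concrete, the following minimal sketch evaluates the two sub-packetization formulas stated in the abstract for a given choice of $(n, u, k, \bar{d})$. The function names (`rack_parameters`, `sub_packetization`) and the sample values are hypothetical and chosen only for illustration; they do not come from the paper, and the sketch computes parameters only, not the codes themselves.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b for positive integers (avoids float rounding)."""
    return -(-a // b)


def rack_parameters(n: int, u: int, k: int) -> tuple[int, int, int]:
    """Return (n_bar, k_bar, v), where n = n_bar * u and k = k_bar * u + v, 0 <= v <= u - 1.

    Hypothetical helper: assumes the n nodes split evenly into racks of u nodes.
    """
    if n % u != 0:
        raise ValueError("n must be a multiple of the rack size u")
    n_bar = n // u
    k_bar, v = divmod(k, u)  # divmod guarantees 0 <= v <= u - 1
    return n_bar, k_bar, v


def sub_packetization(n_bar: int, k_bar: int, d_bar: int) -> tuple[int, int, int]:
    """Return (s_bar, l_first, l_second) using the formulas quoted in the abstract.

    s_bar    = d_bar - k_bar + 1
    l_first  = s_bar ** ceil(n_bar / s_bar)        (first construction)
    l_second = s_bar ** ceil(n_bar / (s_bar + 1))  (second construction)
    """
    if not (k_bar + 1 <= d_bar <= n_bar - 1):
        raise ValueError("need k_bar + 1 <= d_bar <= n_bar - 1")
    s_bar = d_bar - k_bar + 1
    l_first = s_bar ** ceil_div(n_bar, s_bar)
    l_second = s_bar ** ceil_div(n_bar, s_bar + 1)
    return s_bar, l_first, l_second


# Illustrative values (assumed, not taken from the paper): 24 nodes, 3 per rack, k = 14.
n_bar, k_bar, v = rack_parameters(n=24, u=3, k=14)
s_bar, l_first, l_second = sub_packetization(n_bar, k_bar, d_bar=6)
print(n_bar, k_bar, v)            # 8 4 2
print(s_bar, l_first, l_second)   # 3 27 9
```

For these sample values, $u-v=1$, so by the ranges stated in the abstract the optimal-bandwidth regime would cover $h=1$ and the asymptotically optimal regime would cover $h=2,3$.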
