Optimal Repair of $(k+2, k, 2)$ MDS Array Codes

Zihao Zhang, Guodong Li, Sihuang Hu

Abstract

Maximum distance separable (MDS) codes are widely used in distributed storage systems as they provide optimal fault tolerance for a given amount of storage overhead. The seminal work of Dimakis et al. first established a lower bound on the repair bandwidth for a single failed node of MDS codes, known as the cut-set bound. MDS codes that achieve this bound are called minimum storage regenerating (MSR) codes. Numerous constructions and theoretical analyses of MSR codes reveal that they typically require exponentially large sub-packetization levels, leading to significant disk I/O overhead. To mitigate this issue, many studies explore the trade-offs between the sub-packetization level and repair bandwidth, achieving reduced sub-packetization at the cost of suboptimal repair bandwidth. Despite these advances, the fundamental question of determining the minimum repair bandwidth for a single failure of MDS codes with fixed sub-packetization remains open. In this paper, we address this challenge for the case of two parity nodes ($n-k=2$) and sub-packetization $\ell=2$. Under these parameters, we establish a correspondence between repair schemes and point sets on the projective line $\mathbb{P}^1$, and then derive a lower bound on repair bandwidth utilizing the sharply 3-transitive action of $\mathrm{PGL}_2(\mathbb{F}_q)$. Furthermore, we extend this lower bound to the repair I/O, and construct two classes of explicit MDS array codes that achieve these bounds, offering practical code designs with provable repair efficiency.
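The sharply 3-transitive action mentioned in the abstract can be checked by brute force on a small example. The sketch below is an illustration only, not the paper's argument: it assumes a prime field $\mathbb{F}_p$ (so arithmetic is plain modular arithmetic), and the helper names `normalize`, `act`, and `image_triple_counts` are hypothetical. It maps the base triple $(0, 1, \infty)$ under every invertible $2\times 2$ matrix and counts how often each image triple occurs.

```python
from collections import Counter
from itertools import permutations, product


def normalize(x, y, p):
    """Canonical representative of the projective point [x : y] over F_p."""
    if y % p != 0:
        return (x * pow(y, -1, p) % p, 1)  # finite point [x/y : 1]
    return (1, 0)  # the point at infinity [1 : 0]


def act(m, pt, p):
    """Apply the matrix m = (a, b, c, d) to a projective point."""
    a, b, c, d = m
    x, y = pt
    return normalize(a * x + b * y, c * x + d * y, p)


def image_triple_counts(p):
    """Count, per image triple, the invertible matrices sending (0, 1, inf) to it."""
    base = [(0, 1), (1, 1), (1, 0)]  # the points 0, 1, infinity
    counts = Counter()
    for m in product(range(p), repeat=4):
        a, b, c, d = m
        if (a * d - b * c) % p == 0:
            continue  # skip singular matrices
        counts[tuple(act(m, pt, p) for pt in base)] += 1
    return counts


p = 5  # assumption: a small prime chosen for illustration
counts = image_triple_counts(p)
points = [(x, 1) for x in range(p)] + [(1, 0)]
all_triples = set(permutations(points, 3))
print(len(counts), len(all_triples), set(counts.values()))
```

Every ordered triple of distinct points is reached by exactly $p-1$ matrices, which are the scalar multiples of a single element of $\mathrm{PGL}_2(\mathbb{F}_p)$. This is what "sharply 3-transitive" means: exactly one group element carries the base triple to any given triple.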

Paper Structure

This paper contains 10 sections, 19 theorems, 57 equations, 3 figures, 4 tables.

Key Result

Theorem 1

Let $\mathcal{C}$ be a $(k+2, k, 2)$ MDS array code with $n = k+2$, and let $\beta_i$ be the minimal repair bandwidth for each packet $C_i$ (where $1\le i\le n$). Then the avg-min repair bandwidth $\bar{\beta}(\mathcal{C}) := \frac{1}{n} \sum_{i=1}^{n}\beta_i$ satisfies that

Figures (3)

  • Figure 1: An $(n,k)=(6,4)$ RS-coded stripe.
  • Figure 2: An $(n,k,\ell)=(6,4,8)$ MSR-coded stripe.
  • Figure 3: An $(n,k,\ell)=(6,4,2)$ MDS-coded stripe.
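For orientation, the standard cut-set bound can be evaluated on the three parameter sets named in the figure captions. This is a minimal sketch using the well-known formula $d\ell/(d-k+1)$ for repairing one node from $d$ helper nodes, compared with naive repair that downloads $k$ whole nodes. The function names are hypothetical, bandwidth is counted in symbols of the base field, and the sketch says nothing about whether the bound is attainable at a given $\ell$, which is the question the paper studies for $\ell=2$.

```python
from fractions import Fraction


def naive_repair_bandwidth(k, ell):
    """Symbols downloaded when a node is rebuilt from k full helper nodes."""
    return k * ell


def cut_set_bound(n, k, ell, d=None):
    """Cut-set lower bound d*ell/(d-k+1) on single-node repair bandwidth.

    d is the number of helper nodes; by default all n-1 surviving nodes help.
    """
    if d is None:
        d = n - 1
    if not k <= d <= n - 1:
        raise ValueError("need k <= d <= n-1 helper nodes")
    return Fraction(d * ell, d - k + 1)


# Parameter sets taken from the captions of Figures 1-3 (ell = 1 for plain RS).
for n, k, ell in [(6, 4, 1), (6, 4, 8), (6, 4, 2)]:
    print((n, k, ell), naive_repair_bandwidth(k, ell), cut_set_bound(n, k, ell))
```

With $(n,k,\ell)=(6,4,8)$ the bound is 20 symbols against 32 for naive repair; with $(6,4,2)$ it is 5 against 8. Using `Fraction` keeps the bound exact when it is not an integer, as in the $\ell=1$ case, where it equals $5/2$.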

Theorems & Definitions (37)

  • Theorem 1
  • Corollary 2
  • Theorem 3
  • Corollary 4
  • Lemma 5
  • proof
  • Lemma 6
  • proof
  • Definition 1: Total order $\prec$ on the power set $2^{[n]}$
  • Remark 1
  • ...and 27 more