Tight Lower Bounds on the Bandwidth Cost of MDS Convertible Codes in the Split Regime
Shubhransh Singhvi, Saransh Chopra, K. V. Rashmi
TL;DR
This work addresses the bandwidth cost of converting data between MDS codes in distributed storage, focusing on systematic MDS convertible codes in the split regime where $k^{I} = \lambda^{F} k^{F}$ with ${\lambda^{F}} \ge 2$. It introduces an information-theoretic framework that eliminates linearity and uniform-per-node-download assumptions, and derives a sequence of lower bounds on the conversion bandwidth $\gamma_{\mathrm{R}}$, proving tightness in several parameter regions. The central contributions include universal lower bounds for all parameters in the split regime, tight results when ${r^{F}} \ge k^{F}$ or ${r^{I}} \le k^{F}$, and partial resolution of the conjecture by Maturana and Rashmi. These results advance understanding of bandwidth-efficient code conversion, informing practical design for adaptive redundancy in distributed storage systems.
Abstract
Recent advances in erasure coding for distributed storage systems have demonstrated that adapting redundancy to varying disk failure rates can lead to substantial storage savings. Such adaptation requires code conversion, wherein data encoded under an initial $[k^I + r^I, k^I]$ code is transformed into data encoded under a final $[k^F + r^F, k^F]$ code, an operation that can be resource-intensive. Convertible codes are a class of codes designed to facilitate this transformation efficiently while preserving desirable properties such as the MDS property. In this work, we investigate the fundamental limits on the bandwidth cost of conversion (the total amount of data transferred between storage nodes during conversion) for systematic MDS convertible codes. Specifically, we study the subclass of conversions known as the split regime, in which a single initial codeword is converted into multiple final codewords. In this setting, prior to this work, the best known lower bounds on the bandwidth cost of conversion for all parameters were derived by Maturana and Rashmi under certain uniformity assumptions on the number of symbols downloaded from each node. Further, these bounds were shown to be tight for the parameter regime where $r^F \geq k^F$ or $r^I \leq r^F$. In this work, we derive lower bounds on the bandwidth cost of systematic MDS convertible codes for all parameters in the split regime without the uniformity assumption. Moreover, our bounds are tight for the broader parameter regime where $r^F \geq k^F$ or $r^I \leq k^F$. Consequently, our bounds also partially resolve the conjecture proposed by Maturana and Rashmi. We employ a novel information-theoretic framework, which assumes only that the initial and final codes are systematic and does not rely on any linearity assumptions or the aforementioned uniformity assumptions.
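To make the parameter regimes above concrete, the following is a minimal sketch that checks whether a parameter set lies in the split regime ($k^I = \lambda^F k^F$ with $\lambda^F \geq 2$) and whether it falls in the previously known or the broader tightness regime stated in the abstract. The class and function names (`ConversionParams`, `split_factor`, `in_prior_tight_regime`, `in_new_tight_regime`) are hypothetical and chosen only for illustration; the sketch encodes the stated conditions and does not compute any of the bounds themselves.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConversionParams:
    """Illustrative container (hypothetical name) for conversion parameters."""
    k_i: int  # k^I: message symbols in the initial code
    r_i: int  # r^I: parity symbols in the initial code
    k_f: int  # k^F: message symbols in the final code
    r_f: int  # r^F: parity symbols in the final code


def split_factor(p: ConversionParams) -> Optional[int]:
    """Return lambda^F if k^I = lambda^F * k^F with lambda^F >= 2, else None."""
    if p.k_f <= 0 or p.k_i % p.k_f != 0:
        return None
    lam = p.k_i // p.k_f
    return lam if lam >= 2 else None


def in_prior_tight_regime(p: ConversionParams) -> bool:
    """Regime where the earlier bounds were shown tight: r^F >= k^F or r^I <= r^F."""
    return p.r_f >= p.k_f or p.r_i <= p.r_f


def in_new_tight_regime(p: ConversionParams) -> bool:
    """Broader regime stated in the abstract: r^F >= k^F or r^I <= k^F."""
    return p.r_f >= p.k_f or p.r_i <= p.k_f


if __name__ == "__main__":
    # Example values are arbitrary: one [6+3, 6] codeword split into two [3+2, 3] codewords.
    params = ConversionParams(k_i=6, r_i=3, k_f=3, r_f=2)
    print("lambda^F:", split_factor(params))
    print("prior tight regime:", in_prior_tight_regime(params))
    print("new tight regime:", in_new_tight_regime(params))
```

For the arbitrary example parameters $k^I = 6$, $r^I = 3$, $k^F = 3$, $r^F = 2$, the conversion is in the split regime with $\lambda^F = 2$; it lies outside the earlier tightness regime (since $r^F < k^F$ and $r^I > r^F$) but inside the broader one (since $r^I \leq k^F$).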
