Batch Array Codes
Xiangliang Kong, Chen Wang, Yiwei Zhang
TL;DR
This work investigates batch array codes (BACs), an array-based analogue of batch codes for coded distributed storage and private information retrieval (PIR), and shows that allowing local computation at each bucket can reduce storage overhead while preserving the batch recovery property. It develops information-theoretic lower bounds on the minimum total bucket length $N$ of $(n,N,k,m)$-BACs and provides tighter bounds for the parameter regimes $k<m<2k$ and $m=k+2$. The authors present three BAC constructions (cyclic shifted sets, good-vectors CPIR-based designs, and a random systematic BAC), each achieving low redundancy in a different regime, along with a gadget-like method for combining constructions. Overall, the results clarify the trade-off between storage and performance in BACs and point to tighter bounds and more uniform constructions as directions for future work.
Abstract
Batch codes are a class of codes designed for coded distributed storage systems and private information retrieval protocols. They have attracted considerable attention in recent years because they enable efficient and secure storage in distributed systems. In this paper, we study an array code version of batch codes, called the \emph{batch array code} (BAC). In a BAC, each node stores a bucket containing multiple code symbols and, during the recovery of a requested symbol, responds with a locally computed linear combination of the symbols in its bucket. We demonstrate that BACs can support the same type of requests as the original batch codes but with reduced redundancy. Specifically, we establish information-theoretic lower bounds on the code lengths and provide several code constructions that confirm the tightness of the lower bounds in certain parameter regimes.
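To make the bucket model in the abstract concrete, the following is a minimal brute-force sketch, not the paper's construction or formal definition. It assumes linear recovery over GF(2): a batch of $k$ requests is served if the buckets can be split into disjoint groups, one per request, such that each requested symbol lies in the span of the symbols stored in its group (each bucket then answers with a single linear combination). The function names, the bitmask encoding, and the toy code with $n=3$, $k=2$, $m=3$, $N=5$ are illustrative assumptions.

```python
from itertools import combinations_with_replacement, product

def in_span(target, vectors):
    """True if `target` is a GF(2) linear combination of `vectors`.
    Vectors are ints used as bitmasks: bit i stands for information symbol x_{i+1}."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears the leading bit of b in v, if it is set
        if v:
            basis.append(v)
    for b in basis:
        target = min(target, target ^ b)
    return target == 0

def serves(buckets, requests):
    """True if the requests can be answered by pairwise disjoint groups of buckets."""
    k = len(requests)
    # assign[j] == t means bucket j helps request t; assign[j] == k means bucket j is unused.
    for assign in product(range(k + 1), repeat=len(buckets)):
        if all(
            in_span(1 << i, [v for j, b in enumerate(buckets) if assign[j] == t for v in b])
            for t, i in enumerate(requests)
        ):
            return True
    return False

def is_bac(buckets, n, k):
    """Check every multiset of k requests among n information symbols (assumed model)."""
    return all(serves(buckets, req) for req in combinations_with_replacement(range(n), k))

# Hypothetical toy code: buckets {x1, x2}, {x2, x3}, {x1 + x3}; total length N = 5.
toy = [[0b001, 0b010], [0b010, 0b100], [0b101]]
# A layout that fails: the batch (x1, x1) would need the first bucket twice.
bad = [[0b001, 0b010], [0b100], [0b111]]

print(is_bac(toy, n=3, k=2), is_bac(bad, n=3, k=2))
```

In the toy code, the batch $(x_1, x_1)$ is served by the first bucket alone and by the second and third together, since $x_3 + (x_1 + x_3) = x_1$ over GF(2). The exhaustive search is exponential in the number of buckets, so it is only meant for checking very small examples.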
