Batch Array Codes

Xiangliang Kong; Chen Wang; Yiwei Zhang

TL;DR

This work investigates batch array codes (BACs), an array-based analogue of batch codes for coded distributed storage and private information retrieval (PIR), and shows that allowing local computation at each bucket can reduce storage overhead while preserving the batch recovery property. It develops information-theoretic lower bounds on the minimum total bucket length $N$ of $(n,N,k,m)$-BACs and gives tighter bounds for the parameter regimes $k<m<2k$ and $m=k+2$. The authors present three explicit BAC constructions (cyclic shifted sets, a "good vectors" CPIR-based design, and a random systematic BAC), each achieving low redundancy in a different regime, along with a gadget-like method for combining constructions. Overall, the results clarify the storage-performance trade-offs in BACs and identify directions for future work: tighter bounds and more uniform constructions.

Abstract

Batch codes are a type of code specifically designed for coded distributed storage systems and private information retrieval protocols. These codes have received much attention in recent years due to their ability to enable efficient and secure storage in distributed systems. In this paper, we study an array code version of batch codes, called the \emph{batch array code} (BAC). In the BAC setting, each node stores a bucket containing multiple code symbols and, during the recovery of a requested symbol, responds with a locally computed linear combination of the symbols in its bucket. We demonstrate that BACs can support the same type of requests as the original batch codes but with reduced redundancy. Specifically, we establish information-theoretic lower bounds on the code lengths and provide several code constructions that confirm the tightness of the lower bounds for certain parameter regimes.
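To make the BAC setting concrete, the following minimal Python sketch brute-force checks a toy code over $\mathbb{F}_2$. The toy code, the bitmask encoding, and the helper names (`in_span`, `can_serve`, `is_bac`) are illustrative assumptions and are not taken from the paper. The sketch assumes that a set of buckets is a recovery set for $x_i$ exactly when $x_i$ lies in the linear span of the symbols stored in those buckets, and that each bucket is used by at most one recovery set per batch of $k$ requests.

```python
from itertools import combinations, combinations_with_replacement

def in_span(target, vectors):
    """True if bitmask `target` is a GF(2) linear combination of `vectors`."""
    basis = []  # XOR basis, kept sorted in decreasing order of leading bit
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clear b's leading bit in v if it is set
        if v:
            basis.append(v)
            basis.sort(reverse=True)
    for b in basis:
        target = min(target, target ^ b)
    return target == 0

def can_serve(buckets, requests, used=frozenset()):
    """Search for pairwise disjoint recovery sets, one per requested index."""
    if not requests:
        return True
    i, rest = requests[0], requests[1:]
    free = [j for j in range(len(buckets)) if j not in used]
    for r in range(1, len(free) + 1):
        for subset in combinations(free, r):
            symbols = [s for j in subset for s in buckets[j]]
            if in_span(1 << i, symbols) and can_serve(buckets, rest, used | set(subset)):
                return True
    return False

def is_bac(buckets, n, k):
    """Check every multiset of k requests over n information symbols."""
    return all(can_serve(buckets, req)
               for req in combinations_with_replacement(range(n), k))

# Hypothetical toy code: x1 = 0b001, x2 = 0b010, x3 = 0b100.
# Bucket 1 stores (x1, x2), bucket 2 stores (x2, x3), bucket 3 stores x1+x2+x3.
BUCKETS = [[0b001, 0b010], [0b010, 0b100], [0b111]]
TOTAL_LENGTH = sum(len(b) for b in BUCKETS)  # N = 5 stored symbols, m = 3 buckets

print(is_bac(BUCKETS, n=3, k=2))
print(is_bac(BUCKETS, n=3, k=3))
```

In this toy code, the request $\{x_1, x_1\}$ is served by bucket 1 alone (returning $x_1$) and by buckets 2 and 3 together: bucket 2 returns the locally computed sum $x_2+x_3$, bucket 3 returns $x_1+x_2+x_3$, and their sum is $x_1$. The brute-force search is exponential in the number of buckets and is only meant for very small examples.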

Paper Structure (12 sections, 26 theorems, 100 equations, 5 tables)

Introduction
Notations and preliminaries
Lower bounds on the code length of BACs and PIR array codes
A general lower bound
An improved bound for $k<m<2k$
An improved bound for $m=k+2$
Constructions of batch array codes with low redundancy
A construction through cyclic shifted sets
A construction through "good vectors"
A random construction of systematic BACs
Conclusion and further research
Proof of Theorem \ref{Thm: random_cons for BAC}

Key Result

Lemma 2.4

Let $\mathcal{C}$ be an $(n, N, k, m)$-PIR array code over $\mathbb{F}_q$. Then, for any $\mathbf{x}=(x_1,\ldots,x_n)\in \mathbb{F}_q^n$ and any subset $R\subseteq [m]$ of size $m-k+1$, $R$ contains a recovery set of $x_i$ for every $i\in [n]$.
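The counting behind this statement can be sketched as a pigeonhole argument. The sketch below assumes the usual PIR array code property that every $x_i$ has $k$ pairwise disjoint recovery sets $R_1,\ldots,R_k\subseteq[m]$; the notation $R_j$ is introduced here for illustration and the paper's own proof may be phrased differently.

```latex
% Assumption: x_i has k pairwise disjoint recovery sets R_1, ..., R_k \subseteq [m].
% Let R \subseteq [m] with |R| = m-k+1. The complement of R is small:
\begin{align*}
  \bigl|[m]\setminus R\bigr| &= m-(m-k+1) = k-1, \\
  \#\bigl\{\, j\in[k] : R_j \cap ([m]\setminus R) \neq \emptyset \,\bigr\} &\le k-1 < k .
\end{align*}
% Each bucket outside R meets at most one R_j (the R_j are disjoint),
% so at least one R_j avoids the complement entirely, i.e. R_j \subseteq R.
```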

Theorems & Definitions (60)

Example 1
Definition 2.1
Definition 2.2
Definition 2.3
Lemma 2.4
proof
Lemma 2.5
proof
Example 2
Theorem 3.1
...and 50 more
