Coded Distributed (Batch) Matrix Multiplication over Galois Ring via RMFE
Yi Kuang, Jiang Li, Songsong Li, Chaoping Xing
TL;DR
This work advances coded distributed matrix multiplication by moving from large Galois fields to Galois rings $\mathsf{GR}(p^e,d)$ and employing Reverse Multiplication Friendly Embedding (RMFE) to enable efficient interpolation even when the number of workers is large. It introduces a general RMFE-based framework for batched matrix multiplication, including Batch-$\widetilde{\mathsf{EP}}_{\mathsf{RMFE}}$, which achieves a lower recovery threshold than GCSA, and two Single CDMM constructions, $\widetilde{\mathsf{EP}}_{\mathsf{RMFE}}$-I and II, that optimize encoding/upload costs and decoding/download costs respectively, while nearly matching the performance of EP codes over a Galois field of size $q\geq N$. The paper gives explicit costs for encoding, decoding, and per-worker computation, and reports experiments on $\mathbb{Z}_{2^{64}}$ with varying numbers of workers and extension degrees. Overall, the RMFE-based approach enables practical, hardware-friendly CDMM over Galois rings for both batch and single matrix multiplication, and lays groundwork for extensions to secure and private CDMM.
Abstract
Coded Distributed Matrix Multiplication (CDMM) performs distributed matrix multiplication (DMM) on large-scale matrices through a coding scheme such that any $R$ of the $N$ worker nodes suffice to recover the final product, where $N$ is the length of the code and $R\leq N$ is called the recovery threshold. The state-of-the-art CDMM schemes, such as EP codes for Single DMM and GCSA codes for batch DMM, are defined over a Galois field $\mathsf{GF}(q)$ of size $q\geq N$. These schemes are inefficient over small Galois fields such as $\mathsf{GF}(2)$ and over the integer residue ring $\mathbb{Z}_{p^{e}}$, due to the lack of invertible elements for interpolation. DMM over $\mathbb{Z}_{p^{e}}$ (such as $\mathbb{Z}_{2^{64}}$) is well-motivated in practice due to its direct compatibility with hardware. In this work, we construct efficient CDMM over the Galois ring $\mathsf{GR}(p^e,d)$, which is an extension ring of $\mathbb{Z}_{p^{e}}$ of degree $d$; in particular, $\mathsf{GR}(p,d)=\mathsf{GF}(p^d)$ is the Galois field and $\mathsf{GR}(p^e,1)=\mathbb{Z}_{p^e}$. We first give a general CDMM framework for a batch of $n$ matrix multiplications via the well-known Reverse Multiplication Friendly Embedding (RMFE) (Cascudo et al., Crypto'18). Compared with GCSA, our construction has a smaller recovery threshold by a factor of $1/n$. Next, we optimize EP codes via batch preprocessing of the input matrices. We give two types of Single CDMM, which achieve almost the same performance as EP codes over a Galois field of size $q\geq N$. Finally, we present an experimental analysis of our CDMM over Galois rings.
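The remark about the lack of invertible elements for interpolation can be illustrated with a small sketch. Lagrange interpolation divides by differences of evaluation points, so the points must have pairwise invertible differences. The Python code below is an illustration only, not part of the paper's construction, and the helper names `is_unit`, `is_exceptional`, and `max_exceptional_size` are hypothetical. It brute-forces the largest such point set in $\mathbb{Z}_m$ for small $m$.

```python
from itertools import combinations
from math import gcd


def is_unit(a, modulus):
    """Return True if a is invertible in Z_modulus."""
    return gcd(a % modulus, modulus) == 1


def is_exceptional(points, modulus):
    """Return True if all pairwise differences of the points are units.

    Such a set is what Lagrange interpolation over Z_modulus needs,
    because every denominator (x_i - x_j) must be inverted.
    """
    return all(is_unit(x - y, modulus) for x, y in combinations(points, 2))


def max_exceptional_size(modulus):
    """Brute-force the largest usable point set in Z_modulus.

    Illustration only: the search is exponential, so keep modulus small.
    """
    best = 1  # a single point is always usable

    def extend(chosen, start):
        nonlocal best
        best = max(best, len(chosen))
        for candidate in range(start, modulus):
            if all(is_unit(candidate - x, modulus) for x in chosen):
                extend(chosen + [candidate], candidate + 1)

    extend([], 0)
    return best


if __name__ == "__main__":
    for m in (7, 16, 27):
        print(m, max_exceptional_size(m))
```

For a prime modulus such as 7 every nonzero difference is a unit, so all 7 elements can serve as evaluation points, while $\mathbb{Z}_{16}$ admits only 2 and $\mathbb{Z}_{27}$ only 3. This matches the fact that a usable point set in $\mathbb{Z}_{p^e}$ has at most $p$ elements, whatever the value of $e$, which is why a scheme that needs $N$ evaluation points cannot run directly over $\mathbb{Z}_{2^{64}}$ once $N>2$.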
