Multi-access Distributed Computing Models from Map-Reduce Arrays
Shanuja Sasi, Onur Günlü, B. Sundar Rajan
TL;DR
This work introduces a unifying framework for multi-access distributed computing (MADC) by representing models with Map-Reduce Graphs (MRGs) and Map-Reduce Arrays (MRAs), enabling new topologies that scale better than the existing combinatorial topology (CT). It establishes how MRAs map to shuffling and reduction in MADC, and develops coding schemes that reduce communication while balancing computation, even when the number of reducers becomes large. The paper defines Nearest Neighbor Connect-MRGs (NNC-MRGs) and Generalized Combinatorial-MRGs (GC-MRGs), showing that $l$-cyclic $g$-regular Placement Delivery Arrays (PDAs) yield MRAs for these topologies, with concrete load expressions such as $L(r,\alpha)=\frac{(\Lambda-\alpha r)(\Lambda-(\alpha-1)r)}{\Lambda(\Lambda+(\alpha-1)r)}$ and $L(r)=\frac{1}{K}\sum_{\alpha}|...|$ in GC-MRGs. It also provides a lower bound and optimality conditions that connect to CT for certain parameter choices, showing the trade-offs among the number of reducers, the number of input files, and the communication load. Overall, MRAs offer a versatile toolkit for designing coded shuffling schemes with flexible topology, potentially reducing resource requirements for large-scale Map-Reduce deployments.
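As a rough illustration, the first load expression quoted above, $L(r,\alpha)=\frac{(\Lambda-\alpha r)(\Lambda-(\alpha-1)r)}{\Lambda(\Lambda+(\alpha-1)r)}$, can be evaluated exactly with rational arithmetic. This is only a sketch: the function name `load`, the argument names, and the sample values are assumptions made for illustration, and this summary does not state which combinations of $\Lambda$, $r$, and $\alpha$ are admissible in the paper.

```python
from fractions import Fraction


def load(lam: int, r: int, alpha: int) -> Fraction:
    """Evaluate L(r, alpha) as quoted in the TL;DR, with Lambda passed as `lam`.

    The arguments are treated as plain integers; whether a given triple is
    a valid parameter choice in the paper is not checked here (assumption).
    """
    numerator = (lam - alpha * r) * (lam - (alpha - 1) * r)
    denominator = lam * (lam + (alpha - 1) * r)
    return Fraction(numerator, denominator)


if __name__ == "__main__":
    # Illustrative values only: Lambda = 6, r = 1, alpha in {1, 2, 3}.
    for alpha in (1, 2, 3):
        print(f"alpha={alpha}: L = {load(6, 1, alpha)}")
```

Using `Fraction` keeps the result exact, which makes it easier to compare against closed-form values than floating-point output would.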
Abstract
A novel distributed computing model called "Multi-access Distributed Computing (MADC)" was recently introduced in arXiv:2206.12851. In this paper, we represent MADC models via 2-layered bipartite graphs called Map-Reduce Graphs (MRGs) and a set of arrays called Map-Reduce Arrays (MRAs), inspired by the Placement Delivery Arrays (PDAs) used in the coded caching literature. The connection between MRAs and MRGs is established, which allows new topologies to be explored and coded shuffling schemes to be provided for MADC models with MRGs using the structure of MRAs. A novel Nearest Neighbor Connect-MRG (NNC-MRG) is explored, and a coding scheme is provided for MADC models with NNC-MRG by exploiting the connections between MRAs and PDAs. Moreover, the combinatorial topology (CT) is generalized to the Generalized Combinatorial-MRG (GC-MRG). A set of $g$-regular MRAs is provided that corresponds to the existing scheme for MADC models with CT, and these are extended to generate another set of MRAs that represents MADC models with GC-MRG. A lower bound on the computation-communication curve for the MADC model with GC-MRG under a homogeneous setting is derived, and certain cases are explored in which the existing scheme is optimal under CT. One of the major limitations of the existing scheme for CT is that it requires an exponentially large number of reducer nodes and input files for large $\Lambda$. This can be overcome by representing CT by MRAs, for which coding schemes can be derived even if some of the reducer nodes are not present. Another way of tackling this is to use a different MRG, specifically NNC-MRG, for which the number of reducer nodes and files required is significantly smaller than for CT. The advantage is thus two-fold, fewer reducer nodes and fewer input files, and it is achieved at the expense of a slight increase in the communication load.
