Multi-access Distributed Computing Models from Map-Reduce Arrays

Shanuja Sasi; Onur Günlü; B. Sundar Rajan

TL;DR

This work introduces a unifying framework for multi-access distributed computing (MADC) that represents models with Map-Reduce Graphs (MRGs) and Map-Reduce Arrays (MRAs), enabling new topologies that scale better than the existing combinatorial topology (CT). It establishes how MRAs map to the shuffle and reduce phases in MADC, and develops coding schemes that reduce communication while balancing computation, even when the number of reducers becomes large. The paper defines Nearest Neighbor Connect-MRGs (NNC-MRGs) and Generalized Combinatorial-MRGs (GC-MRGs), showing that $l$-cyclic $g$-regular Placement Delivery Arrays (PDAs) yield MRAs for these topologies, with concrete load expressions such as $L(r,\alpha)=\frac{(\Lambda-\alpha r)(\Lambda-(\alpha-1)r)}{\Lambda(\Lambda+(\alpha-1)r)}$ and $L(r)=\frac{1}{K}\sum_{\alpha}|...|$ in GC-MRGs. It also provides lower bounds and optimality conditions that connect to CT when certain parameters are chosen, demonstrating trade-offs between the number of reducers, the number of input files, and the communication load. Overall, MRAs offer a versatile toolkit for designing coded shuffling schemes with flexible topologies, potentially reducing resource requirements for large-scale Map-Reduce deployments.
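To get a feel for the first load expression quoted above, the following minimal sketch evaluates $L(r,\alpha)$ exactly with rational arithmetic. The function name `load_expression` and the sample parameter values are illustrative assumptions, and the summary does not state the parameter ranges over which the expression is valid, so the inputs below are chosen only to show the arithmetic.

```python
from fractions import Fraction


def load_expression(lam: int, r: int, alpha: int) -> Fraction:
    """Evaluate L(r, alpha) = (lam - alpha*r)(lam - (alpha-1)*r) / (lam * (lam + (alpha-1)*r)).

    lam, r and alpha stand for Lambda, r and alpha in the quoted expression.
    The valid parameter range is not given in the summary (assumption: positive integers).
    """
    numerator = (lam - alpha * r) * (lam - (alpha - 1) * r)
    denominator = lam * (lam + (alpha - 1) * r)
    return Fraction(numerator, denominator)


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    for alpha in (1, 2, 3):
        print(alpha, load_expression(6, 1, alpha))
```

For $\alpha=1$ the expression simplifies algebraically to $1-r/\Lambda$, which gives a quick sanity check on any implementation.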

Abstract

A novel distributed computing model called "Multi-access Distributed Computing (MADC)" was recently introduced in arXiv:2206.12851. In this paper, we represent MADC models via 2-layered bipartite graphs called Map-Reduce Graphs (MRGs) and a set of arrays called Map-Reduce Arrays (MRAs), inspired by the Placement Delivery Arrays (PDAs) used in the coded caching literature. We establish the connection between MRAs and MRGs, which lets us explore new topologies and provide coded shuffling schemes for MADC models with MRGs using the structure of MRAs. A novel Nearest Neighbor Connect-MRG (NNC-MRG) is explored, and a coding scheme is provided for MADC models with NNC-MRG, exploiting the connections between MRAs and PDAs. Moreover, the combinatorial topology (CT) is generalized to the Generalized Combinatorial-MRG (GC-MRG). A set of $g$-regular MRAs is provided that corresponds to the existing scheme for MADC models with CT, and these are extended to generate another set of MRAs that represent MADC models with GC-MRG. A lower bound on the computation-communication curve for MADC models with GC-MRG under a homogeneous setting is derived, and certain cases are explored where the existing scheme is optimal under CT. One of the major limitations of the existing scheme for CT is that it requires an exponentially large number of reducer nodes and input files for large $\Lambda$. This can be overcome by representing CT by MRAs, where coding schemes can be derived even if some of the reducer nodes are not present. Another way of tackling this is to use a different MRG, specifically NNC-MRG, for which the number of reducer nodes and files required is significantly smaller than for CT. Hence, the advantages are two-fold, and they are achieved at the expense of a slight increase in the communication load.
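The scaling contrast between CT and NNC-MRG can be sketched by building the reducer-to-mapper connections of each MRG as plain sets. This summary does not spell out the connection rules, so the sketch rests on two labeled assumptions: in CT there is one reducer for every set of $\alpha$ mapper nodes, and in NNC-MRG there are as many reducers as mappers, each connected to $\alpha$ cyclically consecutive mappers. The function names are hypothetical.

```python
from itertools import combinations
from math import comb


def ct_connections(num_mappers: int, alpha: int) -> list[set[int]]:
    """Assumed CT rule: one reducer per alpha-subset of the mapper nodes."""
    return [set(subset) for subset in combinations(range(num_mappers), alpha)]


def nnc_connections(num_mappers: int, alpha: int) -> list[set[int]]:
    """Assumed NNC rule: reducer k connects to mappers k, k+1, ..., k+alpha-1 (mod num_mappers)."""
    return [
        {(k + offset) % num_mappers for offset in range(alpha)}
        for k in range(num_mappers)
    ]


if __name__ == "__main__":
    # Compare reducer counts as the number of mapper nodes (Lambda) grows.
    for lam in (4, 8, 16):
        alpha = lam // 2
        print(lam, alpha, len(ct_connections(lam, alpha)), len(nnc_connections(lam, alpha)))
```

Under these assumptions the CT reducer count is the binomial coefficient $\binom{\Lambda}{\alpha}$, while the NNC count stays at $\Lambda$, which matches the abstract's remark that CT needs far more reducer nodes for large $\Lambda$.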

Paper Structure (20 sections, 2 theorems, 75 equations, 5 figures, 2 tables, 2 algorithms)

Introduction
Problem Definition
BE Scheme [BP]
Placement Delivery Array [YCTCPDA]
Map-Reduce Graphs and Map-Reduce Arrays
Topologies from MRAs
Nearest Neighbor Connect-MRG
Generalized Combinatorial-MRG
Conclusion
Proof of Theorem 1
Shuffle Phase
Reduce Phase
Proof of Theorem 2
MRAs for MADC Models with NNC-MRG
Proof of Correctness of Algorithm 1
...and 5 more sections

Key Result

Corollary 1

For a fixed $\alpha \in [\Lambda]$ and $K_{\alpha}=1$, the GC-MRG reduces to CT. For MADC models with CT, when $\alpha=1$, each reducer node is assigned exactly one unique mapper node, which corresponds to the original DC model [LMA]. Hence, we have $K=\Lambda$. For this setting, our lower bound in Th

Figures (5)

Figure 1: MRG for an MADC model consisting of $3$ batches of files, mapper nodes and reducer nodes with $r=1$ and each reducer node connected to $2$ mapper nodes.
Figure 2: MRG for an MADC model consisting of $3$ batches of files, mapper nodes and reducer nodes with $r=2$ and each reducer node connected to $1$ mapper node.
Figure 3: MRG for an MADC model consisting of $3$ batches of files, $6$ mapper nodes and $3$ reducer nodes with $r=2$ and each reducer node connected to $2$ mapper nodes.
Figure 4: MADC model corresponding to Example 1 with $4$ mapper nodes and $5$ reducer nodes.

Theorems & Definitions (33)

Definition 1
Definition 2
Example 1
Definition 3
Example 2
Definition 4
Example 3
Definition 5
Example 4
Definition 6
...and 23 more
