On the Optimality of Coded Distributed Computing for Ring Networks

Zhenhao Huang, Minquan Cheng, Kai Wan, Qifu Tyler Sun, Youlong Wu

TL;DR

The paper derives an information-theoretic lower bound on the optimal communication load of coded distributed computing over ring networks and shows that the proposed all-to-all scheme is asymptotically optimal under the cyclic placement when $N\gg r$.

Abstract

We consider a coded distributed computing problem in a ring-based communication network, where $N$ computing nodes are arranged in a ring topology and each node can only communicate with its neighbors within a constant distance $d$. To mitigate the communication bottleneck in exchanging intermediate values, we propose new coded distributed computing schemes for the ring-based network that exploit both the ring topology and redundant computation (i.e., each map function is computed by $r$ nodes). Two typical cases are considered: all-gather, where each node requires all intermediate values mapped from all input files, and all-to-all, where each node requires a distinct set of intermediate values from other nodes. For the all-gather case, we propose a new coded scheme based on successive reverse carpooling, in which every encoded packet a node transmits combines two messages traveling in opposite directions along the same path. A converse proof shows that our scheme achieves the optimal tradeoff between communication load, computation load $r$, and broadcast distance $d$ when $N\gg d$. For the all-to-all case, instead of simply repeating our all-gather scheme, we deliver intermediate values according to their proximity to the intended nodes, which reduces unnecessary transmissions. We derive an information-theoretic lower bound on the optimal communication load and show that our scheme is asymptotically optimal under the cyclic placement when $N\gg r$. The optimality results indicate that in ring-based networks, the redundant computation $r$ yields only an additive gain in reducing communication load, while the broadcast distance $d$ contributes a multiplicative gain.
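The communication constraint in the abstract (each node reaches only neighbors within ring distance $d$) can be made concrete with a small sketch. The function name `ring_neighbors` and the 0-based node indexing are assumptions made for illustration; they are not notation from the paper.

```python
def ring_neighbors(node: int, num_nodes: int, d: int) -> list[int]:
    """Return the nodes within ring distance d of `node`, excluding itself.

    Assumes nodes are labeled 0..num_nodes-1 around the ring (hypothetical
    labeling chosen for this sketch).
    """
    reachable = {(node + k) % num_nodes for k in range(-d, d + 1)}
    reachable.discard(node)
    return sorted(reachable)


if __name__ == "__main__":
    # With N = 8 and d = 2, node 0 can reach two nodes on each side.
    print(ring_neighbors(0, 8, 2))
```

For $N=8$ and $d=2$, node 0 reaches nodes 1, 2, 6, and 7; every other node must be reached through relays, which is where the communication load comes from.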

Paper Structure

This paper contains 24 sections, 71 equations, 13 figures, 6 tables, 2 algorithms.

Figures (13)

  • Figure 1: Ring network with $\mathsf{N}$ nodes, computation load $r=1$ and broadcast distance $d=2$.
  • Figure 2: Two computing scenarios when $r=1$: (a) All-Gather and (b) All-to-All.
  • Figure 3: (a) Reverse carpooling with $3$ nodes. (b) Reverse carpooling for two flows. $P_a^{(t)}$ and $P_b^{(t)}$ are the packets sent by nodes $n_1$ and $n_4$ at time clock $t$, respectively. At clock $t$, node $n_2$ broadcasts $P_a^{(t-1)}\oplus P_b^{(t-2)}$, and node $n_3$ broadcasts $P_a^{(t-2)}\oplus P_b^{(t-1)}$. It effectively enables the two flows to traverse a common path without interfering with each other.
  • Figure 4: Efficient broadcasting over ring network with $\mathsf{N}=8$ nodes, computation load $r=1$ and broadcast distance $d=1$.
  • Figure 5: Comparison of achievable NCL and lower bound of all-to-all. The achievable NCL is obtained under the cyclic placement, while the lower bound is derived for arbitrary placement.
  • ...and 8 more figures
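
The reverse carpooling idea named in the Figure 3 caption can be illustrated with a minimal three-node sketch. It assumes the standard pattern in which two end nodes exchange messages through a middle relay that broadcasts the XOR of the two; the node roles, the function names `xor_bytes` and `reverse_carpool`, and the equal-length byte-string messages are illustrative assumptions, not the paper's scheme.

```python
def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(x) != len(y):
        raise ValueError("messages must have equal length")
    return bytes(p ^ q for p, q in zip(x, y))


def reverse_carpool(msg_a: bytes, msg_b: bytes) -> tuple[bytes, bytes, int]:
    """Exchange msg_a (held by n1) and msg_b (held by n3) via relay n2.

    Returns (what n1 decodes, what n3 decodes, number of transmissions).
    """
    transmissions = 0

    # n1 -> n2 and n3 -> n2: the relay learns both messages.
    transmissions += 2

    # n2 broadcasts a single coded packet that serves both directions.
    coded = xor_bytes(msg_a, msg_b)
    transmissions += 1

    # Each end node cancels its own message to recover the other one.
    decoded_at_n1 = xor_bytes(coded, msg_a)
    decoded_at_n3 = xor_bytes(coded, msg_b)
    return decoded_at_n1, decoded_at_n3, transmissions


if __name__ == "__main__":
    print(reverse_carpool(b"left", b"rght"))
```

Without coding, the relay would forward each message separately, for four transmissions in total; the single XOR broadcast brings this down to three. Per the caption, Figure 3(b) applies the same idea to two flows that share a path.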
