On Existence of Latency Optimal Uncoded Storage Schemes in Geo-Distributed Data Storage Systems

Srivathsa Acharya, P. Vijay Kumar, Viveck R. Cadambe

TL;DR

This paper studies latency optimization in geo-distributed storage by modeling the network as a weighted complete graph with inter-node RTTs $\tau(i,j)$ and analyzing two latency metrics: per-node worst-case latency and system-average latency. The authors introduce the nearest-neighbor graph $\mathcal{G}_{k-1}$ and its extended graph $\mathcal{H}$, and prove the main result: an admissible uncoded storage scheme on $\mathcal{G}_{k-1}$ exists if and only if $\chi(\mathcal{H})=k$, which ties data placement to vertex coloring. They further show that MDS codes always yield admissible schemes on $\mathcal{G}_{k-1}$, although their average latency may incur a code-penalty $\Delta(\mathcal{C})$, and they give an efficient $\Theta(nk)$ algorithm to select optimal parity placements. For the case where uncoded optimality is impossible because $\chi(\mathcal{H})=k+1$, the paper proposes a family of admissible binary codes and illustrates the encoding on a six-node AWS-like network. The results give a graph-theoretic foundation for data placement decisions in geo-distributed storage and outline open problems in designing optimal coded schemes for networks that admit no optimal uncoded scheme.
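The link between placement and coloring can be illustrated with a small brute-force check. This summary does not spell out how $\mathcal{G}_{k-1}$ and $\mathcal{H}$ are built, so the sketch below rests on assumptions: each node's neighborhood is itself plus its $k-1$ nearest nodes by RTT, $\mathcal{H}$ joins any two nodes that appear together in some such neighborhood, and a placement is admissible when every node finds all $k$ files inside its own neighborhood. The function names and both RTT matrices are hypothetical and are not taken from the paper.

```python
from itertools import product

def nearest_neighbors(rtt, i, k):
    """Node i plus its k-1 nearest nodes by RTT (ties broken by node index)."""
    others = sorted((j for j in range(len(rtt)) if j != i),
                    key=lambda j: (rtt[i][j], j))
    return [i] + others[:k - 1]

def extended_graph(rtt, k):
    """Assumed construction of H: join nodes that share some neighborhood."""
    edges = set()
    for i in range(len(rtt)):
        hood = nearest_neighbors(rtt, i, k)
        edges.update((a, b) for a in hood for b in hood if a < b)
    return edges

def find_uncoded_placement(rtt, k):
    """Exhaustively search for a proper k-coloring of H.

    Color c at node v means node v stores file number c. Returns a list of
    colors, or None if H cannot be colored with k colors. The search is
    exponential in the number of nodes and is meant for tiny examples only.
    """
    edges = extended_graph(rtt, k)
    for colors in product(range(k), repeat=len(rtt)):
        if all(colors[a] != colors[b] for a, b in edges):
            return list(colors)
    return None

# Hypothetical RTTs: four nodes on a line at positions 0, 1, 3, 7.
line_rtt = [[0, 1, 3, 7],
            [1, 0, 2, 6],
            [3, 2, 0, 4],
            [7, 6, 4, 0]]

# Hypothetical RTTs chosen so that every pair of nodes shares a neighborhood.
dense_rtt = [[0, 1, 2, 5],
             [1, 0, 4, 3],
             [2, 4, 0, 3],
             [5, 3, 3, 0]]

print(find_uncoded_placement(line_rtt, 3))   # a valid placement with 3 files
print(find_uncoded_placement(dense_rtt, 3))  # None: H is complete on 4 nodes
```

In the second matrix the assumed $\mathcal{H}$ is the complete graph on four nodes, so it needs $k+1 = 4$ colors and no admissible uncoded placement with $k = 3$ files exists under these assumptions.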

Abstract

We consider the problem of geographically distributed data storage in a network of servers (or nodes) where the nodes are connected to each other via communication links having certain round-trip times (RTTs). Each node serves a specific set of clients, where a client can request any of the files available in the distributed system. The parent node provides the requested file if it is available locally; otherwise, it contacts other nodes that have the data needed to retrieve the requested file. This inter-node communication incurs a delay, resulting in a certain latency in servicing the data request. The worst-case latency incurred at a servicing node and the system average latency are important performance metrics of a storage system, which depend not only on inter-node RTTs, but also on how the data is stored across the nodes. Data files can be placed in the nodes as they are, i.e., in uncoded fashion, or can be coded and then placed. This paper provides the necessary and sufficient conditions for the existence of uncoded storage schemes that are optimal in terms of both per-node worst-case latency and system average latency. In addition, the paper provides efficient binary storage codes for a specific case where optimal uncoded schemes do not exist.
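The two metrics named in the abstract can be made concrete for an uncoded placement. The sketch below is an illustration under stated assumptions, not the paper's formal definition: each node stores exactly one file, a local read costs zero, a remote read costs the RTT to the closest node holding the file, and the average is taken uniformly over all (node, file) pairs. The function name and the RTT matrix are hypothetical.

```python
def uncoded_latencies(rtt, placement, k):
    """Return (per-node worst-case latencies, system average latency).

    rtt[i][j]    : round-trip time between nodes i and j, with rtt[i][i] == 0
    placement[j] : index (0..k-1) of the single file stored at node j
    Assumes every file is stored on at least one node.
    """
    n = len(rtt)
    per_node_worst = []
    total = 0.0
    for i in range(n):
        file_latency = []
        for f in range(k):
            # Fetch file f from the closest node that holds it.
            holders = [j for j in range(n) if placement[j] == f]
            file_latency.append(min(rtt[i][j] for j in holders))
        per_node_worst.append(max(file_latency))
        total += sum(file_latency)
    return per_node_worst, total / (n * k)

# Hypothetical RTTs: four nodes on a line at positions 0, 1, 3, 7.
example_rtt = [[0, 1, 3, 7],
               [1, 0, 2, 6],
               [3, 2, 0, 4],
               [7, 6, 4, 0]]

worst, average = uncoded_latencies(example_rtt, [0, 1, 2, 0], k=3)
print(worst)    # worst-case latency seen at each node
print(average)  # mean latency over all (node, file) pairs
```

Changing `placement` while holding `example_rtt` fixed shows how both metrics depend on where the files sit, not only on the RTTs.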

Paper Structure (14 sections, 6 theorems, 13 equations, 6 figures, 2 tables)

Key Result

Proposition 1

For any code $\mathcal{C}$ on $\mathcal{G}$, per-node worst-case latency at any node $i$ is lower-bounded as: Further, the average latency $L_{avg}(\mathcal{C})$ is lower-bounded as:

Figures (6)

  • Figure 1: Example 1: Data store with $4$ nodes and $3$ files with inter-node RTTs. Storage type - Left: Uncoded, Right: Coded.
  • Figure 2: Example 2: Data store with $n = 4$ nodes $\{A,B,C,D\}$, $k = 3$ files $\{W_1, W_2, W_3\}$, and their inter-node RTTs.
  • Figure 3: Nearest-neighbor graphs $\mathcal{G}_{2}$ for the examples in the introduction.
  • Figure 4: Extended graphs for the examples in the introduction. The dashed edges are those added on top of $\mathcal{G}_{2}$.
  • Figure 5: A sample data store with $6$ nodes and their inter-node RTTs (in ms) measured as per the AWS public cloud [cadambeaws].
  • ...and 1 more figure

Theorems & Definitions (24)

  • Example 1
  • Example 2
  • Definition 1: Linear storage code
  • Proposition 1
  • proof
  • Definition 2
  • Definition 3
  • Remark 1
  • Example 3: Scalar MDS Codes
  • Proposition 2
  • ...and 14 more