On Competitiveness of Dynamic Replication for Distributed Data Access

Tianyu Zuo, Xueyan Tang, Bu Sung Lee, Jianfei Cai

TL;DR

This work studies online dynamic replication for distributed data access, aiming to minimize total storage and transfer costs across geo-distributed servers with heterogeneous storage rates. It refutes prior claims of a $2$-competitive online algorithm, proves a universal lower bound $>2$ for deterministic online strategies, and presents a new online algorithm with a tight competitiveness bound of $\max\{2, \min\{\gamma,3\}\}$, where $\gamma$ is the max/min storage-rate ratio across servers. The analysis uses a novel induction-based cost allocation and classifies operations into regular and special copies to bound the online versus offline performance; the bounds are shown to be tight via explicit counterexamples and matching constructions. Empirical evaluation on real traces corroborates the theoretical findings and demonstrates practical gains over existing online approaches in typical transfer-cost regimes, offering rigorous guarantees for online replication in pay-as-you-go cloud and edge environments.

Abstract

This paper studies an online cost optimization problem for distributed storage and access. The goal is to dynamically create and delete copies of data objects over time at geo-distributed servers to serve access requests and minimize the total storage and network cost. We revisit a recent algorithm in the literature and show, by constructing a counterexample, that it does not have a competitive ratio of $2$ as claimed. We further prove that no deterministic online algorithm can achieve a competitive ratio bounded by $2$ for the general cost optimization problem. We develop an online algorithm and prove that it achieves a competitive ratio of $\max\{2, \min\{\gamma, 3\}\}$, where $\gamma$ is the max/min storage cost ratio among all servers. Examples are given to confirm the tightness of the competitive analysis. We also empirically evaluate the algorithms using real object access traces.
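
To make the stated bound concrete, the following minimal sketch evaluates $\max\{2, \min\{\gamma, 3\}\}$ for a given list of per-server storage rates. The function name `competitive_ratio_bound` and its input format are assumptions made for illustration; the sketch computes only the bound from the abstract and is not the paper's online algorithm.

```python
def competitive_ratio_bound(storage_rates):
    """Evaluate max{2, min{gamma, 3}} for a list of per-server storage rates.

    gamma is the ratio of the largest to the smallest storage rate.
    Hypothetical helper for illustration only; it does not implement
    the replication algorithm itself.
    """
    rates = list(storage_rates)
    if not rates or min(rates) <= 0:
        raise ValueError("storage rates must be a non-empty list of positive numbers")
    gamma = max(rates) / min(rates)
    return max(2.0, min(gamma, 3.0))


# Homogeneous rates give gamma = 1, so the bound is 2.
uniform = competitive_ratio_bound([1.0, 1.0, 1.0])
# Moderately heterogeneous rates (gamma = 2.5) give a bound of 2.5.
moderate = competitive_ratio_bound([1.0, 2.5])
# Strongly heterogeneous rates (gamma = 10) are capped at 3.
skewed = competitive_ratio_bound([1.0, 10.0])
```

The bound therefore stays at $2$ whenever $\gamma \le 2$, grows linearly with $\gamma$ between $2$ and $3$, and never exceeds $3$.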

Paper Structure

This paper contains 17 sections, 10 theorems, 34 equations, 25 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

For each deterministic online algorithm, there exists an instance of the cost optimization problem such that the cost of the online solution is more than $2$ times the cost of an optimal offline solution.

Figures (25)

  • Figure 1: A counterexample
  • Figure 2: Another counterexample
  • Figure 3: An example of our algorithm
  • Figure 4: Illustration of different request types in our online algorithm
  • Figure 5: Four types of requests in an optimal offline strategy
  • ...and 20 more figures

Theorems & Definitions (18)

  • Theorem 1
  • proof
  • Proposition 1
  • Proposition 2
  • proof
  • Proposition 3
  • Proposition 4
  • proof
  • Proposition 5
  • Proposition 6
  • ...and 8 more