On Competitiveness of Dynamic Replication for Distributed Data Access
Tianyu Zuo, Xueyan Tang, Bu Sung Lee, Jianfei Cai
TL;DR
This work studies online dynamic replication for distributed data access, aiming to minimize total storage and transfer costs across geo-distributed servers with heterogeneous storage rates. It refutes prior claims of a $2$-competitive online algorithm, proves a universal lower bound $>2$ for deterministic online strategies, and presents a new online algorithm with a tight competitiveness bound of $\max\{2, \min\{\gamma,3\}\}$, where $\gamma$ is the max/min storage-rate ratio across servers. The analysis uses a novel induction-based cost allocation and classifies operations into regular and special copies to bound the online versus offline performance; the bounds are shown to be tight via explicit counterexamples and matching constructions. Empirical evaluation on real traces corroborates the theoretical findings and demonstrates practical gains over existing online approaches in typical transfer-cost regimes, offering rigorous guarantees for online replication in pay-as-you-go cloud and edge environments.
Abstract
This paper studies an online cost optimization problem for distributed storage and access. The goal is to dynamically create and delete copies of data objects over time at geo-distributed servers to serve access requests and minimize the total storage and network cost. We revisit a recent algorithm in the literature and show, by constructing a counterexample, that it does not have a competitive ratio of $2$ as claimed. We further prove that no deterministic online algorithm can achieve a competitive ratio bounded by $2$ for the general cost optimization problem. We develop an online algorithm and prove that it achieves a competitive ratio of $\max\{2, \min\{\gamma, 3\}\}$, where $\gamma$ is the max/min storage cost ratio among all servers. Examples are given to confirm the tightness of the competitive analysis. We also empirically evaluate algorithms using real object access traces.
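As a quick illustration of how the stated bound $\max\{2, \min\{\gamma, 3\}\}$ behaves, the following minimal sketch evaluates it for a few sets of per-server storage rates. The function name `competitive_bound` and the sample rates are hypothetical and chosen only for illustration. The code evaluates the formula alone and does not implement the online algorithm itself.

```python
def competitive_bound(storage_rates):
    """Evaluate max{2, min{gamma, 3}}, where gamma = max rate / min rate.

    `storage_rates` is an assumed representation: one positive storage
    cost rate per server.
    """
    if not storage_rates or min(storage_rates) <= 0:
        raise ValueError("storage_rates must be non-empty and strictly positive")
    gamma = max(storage_rates) / min(storage_rates)
    return max(2.0, min(gamma, 3.0))


# Homogeneous servers: gamma = 1, so the bound is 2.
print(competitive_bound([1.0, 1.0, 1.0]))  # 2.0

# Mild heterogeneity: gamma = 2.5 lies between 2 and 3, so the bound equals gamma.
print(competitive_bound([1.0, 2.5]))       # 2.5

# Strong heterogeneity: gamma = 10, and the bound is capped at 3.
print(competitive_bound([1.0, 10.0]))      # 3.0
```

The bound is therefore $2$ whenever $\gamma \le 2$, grows linearly with $\gamma$ on $[2, 3]$, and never exceeds $3$ however far apart the storage rates are.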
