
Cost-Driven Data Replication with Predictions

Tianyu Zuo, Xueyan Tang, Bu Sung Lee

TL;DR

The paper tackles cost-driven online data replication across geo-distributed servers in the learning-augmented setting. It designs an online replication algorithm in which a hyper-parameter $\alpha \in (0, 1]$ sets the level of distrust in the predictions, trading off consistency against robustness. The authors prove that the algorithm is $\frac{5+\alpha}{3}$-consistent (under perfect predictions) and $(1+\frac{1}{\alpha})$-robust (under arbitrary mispredictions), establish a lower bound of $\frac{3}{2}$ on the consistency of any deterministic learning-augmented algorithm, and propose a misprediction-aware adaptation. Experiments on real IBM traces validate the theory, showing improvements when predictions are accurate and bounded robustness when they are not.

Abstract

This paper studies an online replication problem for distributed data access. The goal is to dynamically create and delete data copies in a multi-server system as time passes to minimize the total storage and network cost of serving access requests. We study the problem in the emergent learning-augmented setting, assuming simple binary predictions about inter-request times at individual servers. We develop an online algorithm and prove that it is $\frac{5+\alpha}{3}$-consistent (competitiveness under perfect predictions) and $(1+\frac{1}{\alpha})$-robust (competitiveness under terrible predictions), where $\alpha \in (0, 1]$ is a hyper-parameter representing the level of distrust in the predictions. We also study the impact of mispredictions on the competitive ratio of the proposed algorithm and adapt it to achieve a bounded robustness while retaining its consistency. We further establish a lower bound of $\frac{3}{2}$ on the consistency of any deterministic learning-augmented algorithm. Experimental evaluations are carried out to evaluate our algorithms using real data access traces.
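
To make the consistency/robustness trade-off concrete, the two bounds stated in the abstract can be tabulated for a few values of $\alpha$. The sketch below only evaluates the closed-form ratios $\frac{5+\alpha}{3}$ and $1+\frac{1}{\alpha}$; it does not implement the replication algorithm, and the function names are illustrative rather than taken from the paper.

```python
from fractions import Fraction

# Lower bound on the consistency of any deterministic
# learning-augmented algorithm, as stated in the abstract.
CONSISTENCY_LOWER_BOUND = Fraction(3, 2)


def _check_alpha(alpha):
    """The hyper-parameter is only defined on the interval (0, 1]."""
    if not (0 < alpha <= 1):
        raise ValueError("alpha must lie in (0, 1]")


def consistency_bound(alpha):
    """Competitive ratio under perfect predictions: (5 + alpha) / 3."""
    _check_alpha(alpha)
    return (5 + alpha) / 3


def robustness_bound(alpha):
    """Competitive ratio under arbitrary mispredictions: 1 + 1 / alpha."""
    _check_alpha(alpha)
    return 1 + 1 / alpha


if __name__ == "__main__":
    # Exact rational arithmetic avoids floating-point noise in the table.
    for alpha in (Fraction(1, 10), Fraction(1, 2), Fraction(1)):
        print(
            f"alpha={alpha}: "
            f"consistency={consistency_bound(alpha)}, "
            f"robustness={robustness_bound(alpha)}"
        )
```

A smaller $\alpha$ (more trust in the predictions) pushes the consistency bound down toward $\frac{5}{3}$ while the robustness bound grows without limit; at $\alpha = 1$ both bounds equal 2. For every admissible $\alpha$ the consistency bound stays above the $\frac{3}{2}$ lower bound.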

Paper Structure (25 sections, 11 theorems, 13 equations, 32 figures, 1 algorithm)


Key Result

Proposition 1

The storage periods of any two special copies do not overlap. Moreover, the storage period of any special copy does not overlap with that of any regular copy.
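
As an illustration of what Proposition 1 asserts, the following sketch checks the property on explicit lists of storage periods. It is not part of the paper's algorithm or proof: the representation of a storage period as a `(start, end)` pair, the half-open interval convention (periods that merely touch at an endpoint do not count as overlapping), and the function names are all assumptions made for the example.

```python
from itertools import combinations


def overlaps(a, b):
    """True if half-open intervals [a0, a1) and [b0, b1) intersect.

    Assumption: touching at a single endpoint is not an overlap.
    """
    return a[0] < b[1] and b[0] < a[1]


def satisfies_proposition_1(special, regular):
    """Check the two claims of Proposition 1 on given storage periods.

    special, regular: lists of (start, end) pairs (hypothetical format).
    Regular copies may overlap one another; the proposition does not
    restrict that, so it is not checked here.
    """
    # Claim 1: no two special copies have overlapping storage periods.
    special_disjoint = all(
        not overlaps(a, b) for a, b in combinations(special, 2)
    )
    # Claim 2: no special copy overlaps any regular copy.
    special_vs_regular = all(
        not overlaps(s, r) for s in special for r in regular
    )
    return special_disjoint and special_vs_regular
```

For example, `satisfies_proposition_1([(0, 2), (5, 7)], [(2, 5), (3, 5)])` holds even though the two regular periods overlap each other, whereas adding a regular period `(6, 8)` violates the second claim.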

Figures (32)

  • Figure 1: An example of our online algorithm
  • Figure 2: Different request types in our online algorithm
  • Figure 3: Illustration of a data copy at $s$ crossing time $t_{p(e)}$
  • Figure 4: Illustration of Case A and Case B
  • Figure 5: A tight example for robustness analysis
  • ...and 27 more figures

Theorems & Definitions (20)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Proposition 5
  • Proposition 6
  • Proposition 7
  • Proposition 8
  • proof
  • Lemma 1
  • ...and 10 more