Latency Guarantees for Caching with Delayed Hits

Keerthana Gurushankar, Noah G. Singer, Bernardo Subercaseaux

TL;DR

This work analyzes caching with delayed hits, where a miss incurs latency governed by $Z$, the ratio of the fetch delay to the inter-request time. It formulates $(Z,k)$-DelayedHitsCaching as a finite-state machine over $n$ pages with a $k$-page cache and analyzes online policies, proving a tight $\Theta(Zk)$ competitiveness bound for deterministic marking-based policies, including LRU. The main contribution is a phase and superphase decomposition that upper-bounds the online latency while lower-bounding the offline optimum, yielding a provable worst-case guarantee and guiding practical caching strategies. Empirical evaluation on 18 datasets shows that actual competitive ratios are typically far below the theoretical bound, and that latency scales with $Z$ in practice.

Abstract

In the classical caching problem, when a requested page is not present in the cache (i.e., a "miss"), it is assumed to travel from the backing store into the cache "before" the next request arrives. However, in many real-life applications, such as content delivery networks, this assumption is unrealistic. The "delayed-hits" model for caching, introduced by Atre, Sherry, Wang, and Berger, accounts for the latency between a missed cache request and the corresponding arrival from the backing store. This theoretical model has two parameters: the "delay" $Z$, representing the ratio between the retrieval delay and the inter-request delay in an application, and the "cache size" $k$, as in classical caching. Classical caching corresponds to $Z=1$, whereas larger values of $Z$ model applications where retrieving missed requests is expensive. Despite the practical relevance of the delayed-hits model, its theoretical underpinnings are still poorly understood. We present the first tight theoretical guarantee for optimizing delayed-hits caching: The "Least Recently Used" algorithm, a natural, deterministic, online algorithm widely used in practice, is $O(Zk)$-competitive, meaning it incurs at most $O(Zk)$ times more latency than the (offline) optimal schedule. Our result extends to any so-called "marking" algorithm.
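Read as an inequality, the guarantee stated above can be sketched as follows. The notation $\mathrm{lat}_{A}(r)$ for the total latency of a policy $A$ on a request sequence $r = (r_1, \dots, r_T)$, and the constant $c$, are assumptions introduced here for illustration; the paper's own definition of the competitive ratio may differ in details, for example by allowing an additive term.

$$\mathrm{lat}_{\mathrm{LRU}}(r) \;\le\; c \cdot Zk \cdot \mathrm{lat}_{\mathrm{OPT}}(r) \quad \text{for every request sequence } r,$$

where $\mathrm{OPT}$ denotes the offline optimal schedule for the same $Z$ and $k$, and $c$ is independent of $Z$, $k$, and $r$.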

Paper Structure

This paper contains 17 sections, 9 theorems, 13 equations, 5 figures, 2 tables.

Key Result

Proposition 2

For every $t \in [T]$, there can be at most one index $i \in [Z-1]$ such that $\sigma^{(t-i)} = {\texttt{Miss}}$ and $r_{t-i} = r_{t}$.

Figures (5)

  • Figure 1: Comparison of the classical caching model and the delayed hits model. Each request has its arrival time underneath and a symbol above it representing whether it was a hit (✓), a miss (✗), or a delayed hit (✓ + ✗). The cache starts out containing $1,2,3$, and in both cases the first request for $4$ is a miss. In both cases, the caching policy decides to evict page $3$ in order to cache page $4$, but the resulting latencies differ, as described next. In standard caching, that decision results in a miss for page $3$ at time $4$, and a total latency of $2$. In the delayed hits model, however, page $4$ arrives from the store at time $6$, resulting in a latency of $3$ for the request at $t=3$ and a delayed hit for the request at time $5$ with a latency of $1$, for a total latency of $4$.
  • Figure 2: Total latency for LRU as a function of the delay $Z$, with $k = 5$ over all datasets.
  • Figure 3: Empirical competitive ratio of LRU (as a function of $Z$) over a WikiBench dataset. Results for other datasets were similar.
  • Figure 4: Empirical competitive ratio of LRU (as a function of $k$) over a WikiBench dataset. Results for other datasets were similar.
  • Figure 5: Empirical competitive ratio as a function of $Zk$. Only a few datasets are displayed to avoid cluttering, but the observed behavior was consistent across all datasets.
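The latency accounting described in the Figure 1 caption can be sketched with a small simulator. This is a minimal sketch under stated assumptions, not the paper's formal model: the function name `simulate_lru_delayed_hits`, the convention that one request arrives per time step starting at $t=1$, the rule that a fetched page is admitted (and the least recently used page evicted) at the moment it arrives, and the example trace (in particular its first two requests and the initial recency order) are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


def simulate_lru_delayed_hits(requests, k, Z, initial_cache=()):
    """Per-request latencies for LRU in a simplified delayed-hits model.

    Assumptions (illustrative, not taken from the paper):
      * one request per time step, starting at t = 1;
      * a missed page arrives exactly Z steps after the miss;
      * arrivals at time t are processed before the request at time t;
      * eviction of the least recently used page happens on arrival;
      * k >= 1.
    """
    # Keys are cached pages, ordered from least to most recently used.
    cache = OrderedDict((page, None) for page in initial_cache)
    in_flight = {}  # page -> time at which its fetch completes
    latencies = []

    for t, page in enumerate(requests, start=1):
        # Admit every page whose fetch has completed by time t.
        arrived = [p for p, arrival in in_flight.items() if arrival <= t]
        for p in arrived:
            del in_flight[p]
            if len(cache) >= k:
                cache.popitem(last=False)  # evict least recently used
            cache[p] = None

        if page in cache:
            # Hit: no latency, refresh recency.
            cache.move_to_end(page)
            latencies.append(0)
        elif page in in_flight:
            # Delayed hit: wait only for the remainder of the ongoing fetch.
            latencies.append(in_flight[page] - t)
        else:
            # Miss: start a fetch and wait the full delay Z.
            in_flight[page] = t + Z
            latencies.append(Z)

    return latencies


# Hypothetical trace: page 3 is initially the least recently used page.
trace = [1, 2, 4, 3, 4]
classical = simulate_lru_delayed_hits(trace, k=3, Z=1, initial_cache=(3, 1, 2))
delayed = simulate_lru_delayed_hits(trace, k=3, Z=3, initial_cache=(3, 1, 2))
print(sum(classical), sum(delayed))  # 2 4
```

On this trace the sketch reproduces the totals from the caption: with $Z=1$ the miss on page $4$ at $t=3$ and the miss on page $3$ at $t=4$ each cost $1$, while with $Z=3$ the miss at $t=3$ costs $3$ and the repeated request at $t=5$ is a delayed hit costing $1$.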

Theorems & Definitions (21)

  • Definition 1: Competitive ratio
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • Definition 4: $(Z,k)\textsc{-DelayedHitsCaching}$
  • Definition 5: Optimal policy
  • Definition 6: LRU policy
  • Theorem 7
  • Lemma 8: $\mathbf{e}_{\texttt{LRU}}$ superphase upper bound
  • ...and 11 more