Latency Guarantees for Caching with Delayed Hits

Keerthana Gurushankar; Noah G. Singer; Bernardo Subercaseaux

TL;DR

This work analyzes caching with delayed hits, where misses incur latency governed by the fetch delay ratio $Z$ relative to inter-request time. It formulates $(Z,k)$-DelayedHitsCaching as a finite-state machine over $n$ pages with a $k$-page cache and analyzes online policies, proving a tight $\Theta(Zk)$ competitiveness bound for deterministic marking-based policies, including LRU. The main contribution is a phase and superphase decomposition that upper-bounds online latency while lower-bounding the offline optimum, yielding a provable worst-case guarantee and guiding practical caching strategies. Empirical evaluation on 18 datasets shows that actual competitive ratios are typically far below the theoretical bound, and that latency scales with $Z$ in practice.

Abstract

In the classical caching problem, when a requested page is not present in the cache (a "miss"), it is assumed to travel from the backing store into the cache before the next request arrives. However, in many real-life applications, such as content delivery networks, this assumption is unrealistic. The "delayed-hits" model for caching, introduced by Atre, Sherry, Wang, and Berger, accounts for the latency between a missed cache request and the corresponding arrival from the backing store. This theoretical model has two parameters: the "delay" $Z$, representing the ratio between the retrieval delay and the inter-request delay in an application, and the "cache size" $k$, as in classical caching. Classical caching corresponds to $Z=1$, whereas larger values of $Z$ model applications where retrieving missed requests is expensive. Despite the practical relevance of the delayed-hits model, its theoretical underpinnings are still poorly understood. We present the first tight theoretical guarantee for optimizing delayed-hits caching: The "Least Recently Used" algorithm, a natural, deterministic, online algorithm widely used in practice, is $O(Zk)$-competitive, meaning it incurs at most $O(Zk)$ times the latency of the (offline) optimal schedule. Our result extends to any so-called "marking" algorithm.
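To make the roles of $Z$ and $k$ concrete, the following is a minimal simulation sketch of LRU under delayed hits. It is not the paper's formal definition. It assumes one request per time step, that a miss at time $t$ starts a fetch arriving at time $t+Z$ and costs latency $Z$, that a request for a page already being fetched (a delayed hit) waits only for the remaining fetch time, and that the page is admitted, evicting the least recently used page if necessary, when the fetch completes. The function name `lru_delayed_hits_latency` and these tie-breaking details are illustrative assumptions.

```python
from collections import OrderedDict


def lru_delayed_hits_latency(requests, k, Z):
    """Total latency of LRU on `requests`, one request per time step.

    Assumed model (a simplification, not the paper's formal definition):
    hit -> 0, miss -> Z, delayed hit -> remaining time of the pending fetch.
    """
    cache = OrderedDict()  # cached pages, ordered least to most recently used
    in_flight = {}         # page -> arrival time of its pending fetch
    total = 0
    for t, page in enumerate(requests):
        # Admit fetches that have completed by time t, in the order they started.
        for p in [q for q, arrival in in_flight.items() if arrival <= t]:
            del in_flight[p]
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[p] = None
        if page in cache:
            cache.move_to_end(page)        # true hit: no latency
        elif page in in_flight:
            total += in_flight[page] - t   # delayed hit: wait for the pending fetch
        else:
            in_flight[page] = t + Z        # miss: start a fetch and wait the full delay
            total += Z
    return total


# With Z = 1 every fetch lands before the next request, so latency counts classical misses.
print(lru_delayed_hits_latency(list("abacab"), k=2, Z=1))  # 4
# With Z = 3, repeated requests during the fetch pay 3 + 2 + 1 and then hit.
print(lru_delayed_hits_latency(list("aaaa"), k=1, Z=3))    # 6
```

Under these assumptions, setting $Z=1$ recovers the classical miss count, while larger $Z$ makes each miss, and each request queued behind it, proportionally more expensive.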
