Dependency-Aware Online Caching

Julien Dallot, Amirmehdi Jafari Fesharaki, Maciej Pacut, Stefan Schmid

TL;DR

This work addresses online caching where an item can be stored only if all of its dependencies, as defined by a directed acyclic graph (DAG), are also in the cache, a setting motivated by packet-classification rule caches in networks. It introduces Bucketing, a randomized online algorithm with a competitive ratio of $O(\log k)$ (more precisely, $2 H_{\min\{k,\ell\}}$), and a deterministic algorithm, Recursive LRU. It also provides tight lower bounds and shows how graph topology affects competitiveness through $\ell$, the maximum independent set size in the transitive closure of the dependency graph. In the bypassing variant, the authors develop BucketingBypass, which attains a bound of $O(\sqrt{k \log k})$ (refined to $6 \sqrt{k \cdot H_{\min\{k,\ell\}}}$); empirically, it shows roughly 2x cost reductions over TreeCaching on representative workloads. The results advance the theory of dependency-aware caching, with practical implications for router-rule caching and centralized-control caching in networks, and strengthen guarantees beyond prior approaches such as CacheFlow and the tree-caching algorithm of Bienkowski et al. (SPAA 2017). The work combines rigorous competitive analysis with practical validation and provides a foundation for further extensions to weighted and more general caching models.
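
The refined bounds can be related to the asymptotic ones with a short calculation. The sketch below uses only the standard harmonic-number inequality $H_n = \sum_{i=1}^{n} \frac{1}{i} \le 1 + \ln n$ and the fact that $\min\{k,\ell\} \le k$; it is not taken from the paper's proofs.

$$
2 H_{\min\{k,\ell\}} \;\le\; 2 H_k \;\le\; 2(1 + \ln k) \;=\; O(\log k),
\qquad
6 \sqrt{k \cdot H_{\min\{k,\ell\}}} \;\le\; 6 \sqrt{k\,(1 + \ln k)} \;=\; O(\sqrt{k \log k}).
$$

When $\ell$ is much smaller than $k$, the refined expressions are correspondingly smaller; for example, $\ell = 1$ gives $H_1 = 1$, so the first bound evaluates to $2$.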

Abstract

We consider a variant of the online caching problem where the items exhibit dependencies among each other: an item can reside in the cache only if all the items it depends on are also in the cache. The dependency relations can form any directed acyclic graph. Such requirements arise, for example, in systems such as CacheFlow (SOSR 2016) that cache forwarding rules for packet classification in IP-based communication networks. First, we present an optimal randomized online caching algorithm which accounts for dependencies among the items. Our randomized algorithm is $O(\log k)$-competitive, where $k$ is the size of the cache, meaning that its cost is never more than $O(\log k)$ times that of an optimal algorithm that knows the future input sequence. Second, we consider the bypassing model, where requests can be served at a fixed price without fetching the item and its dependencies into the cache, a variant of caching with dependencies introduced by Bienkowski et al. at SPAA 2017. For this setting, we give an $O(\sqrt{k \cdot \log k})$-competitive algorithm, which significantly improves the best known competitiveness. In a small case study, our algorithm incurs on average 2x lower cost.
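
The dependency constraint can be made concrete with a minimal Python sketch. It is an illustration only, not the paper's Bucketing algorithm: the graph `DEPS` and the helper names `is_feasible`, `eviction_candidates` and `fetch_set` are hypothetical, and `deps[u]` is assumed to list the items that must be cached whenever `u` is cached.

```python
from typing import Dict, Set

# Hypothetical dependency DAG (NOT a graph from the paper):
# DEPS[u] is the set of items that must be in the cache whenever u is.
DEPS: Dict[int, Set[int]] = {
    1: set(),
    2: {1},
    3: {1},
    4: {2, 3},
    5: {3},
}


def is_feasible(cache: Set[int], deps: Dict[int, Set[int]]) -> bool:
    """A cache is feasible if every cached item has all its dependencies cached."""
    return all(deps[u] <= cache for u in cache)


def eviction_candidates(cache: Set[int], deps: Dict[int, Set[int]]) -> Set[int]:
    """Cached items that no other cached item depends on.

    Evicting any one of them leaves the cache feasible.
    """
    needed = set().union(*(deps[u] for u in cache))
    return cache - needed


def fetch_set(item: int, cache: Set[int], deps: Dict[int, Set[int]]) -> Set[int]:
    """The item plus all of its transitive dependencies missing from a feasible cache."""
    missing: Set[int] = set()
    stack = [item]
    while stack:
        u = stack.pop()
        if u not in cache and u not in missing:
            missing.add(u)
            stack.extend(deps[u])
    return missing
```

With this toy graph, the cache `{1, 2, 3}` is feasible, only items `2` and `3` may be evicted without breaking feasibility, and serving a request to item `4` from the cache `{1}` requires fetching `4`, `2` and `3` together.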

Paper Structure

This paper contains 23 sections, 11 theorems, 16 equations, 3 figures, 2 algorithms.

Key Result

Lemma 2

$\forall i \in [0, m-1]$, $\forall b \in [i+1, m]$ it holds that

Figures (3)

  • Figure 1: Example directed acyclic graph of dependencies among items. An arrow from $u$ to $v$ means that if $u$ is in the cache, then $v$ must also be in the cache. The size of the cache is $k = 8$; the items $9$, $10$ and $11$ are not in the cache, while the grayed items $1, 2, \ldots, 8$ are. If a request to $9$ arrives, we must choose which item to evict while keeping the cache feasible ($8$ or $7$ are the only choices here) before we fetch $9$.
  • Figure 2: Example of buckets formed at the beginning of a phase. We have three buckets, depicted by dashed sets, with maximum items (i.e., candidates for eviction) $6$, $8$ and $7$. The item $1$ is in two buckets.
  • Figure 3: Comparison of the BucketingBypass and Tree Caching algorithms in terms of cost per request for various cache sizes $k$ on a binary tree with height $h(T)=10$ and $1023$ nodes. The left subfigure uses a Zipf distribution with parameter $a=4$, and the right subfigure uses a geometric distribution with parameter $p = \frac{10}{2^{10}}$.
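
Request workloads like those described for Figure 3 could be generated along the following lines. This is a rough sketch under stated assumptions, not the paper's experimental code. The source does not say how distribution values are mapped to tree nodes, so the sketch assumes that nodes are numbered $1$ to $1023$, that rank $r$ maps to node $r$, and that both distributions are truncated to that range. All function names are hypothetical.

```python
import random
from typing import List

HEIGHT = 10
NUM_NODES = 2**HEIGHT - 1  # 1023 nodes in a complete binary tree with 10 levels


def zipf_weights(n: int, a: float) -> List[float]:
    """Unnormalized truncated Zipf weights: rank r gets weight r**(-a)."""
    return [r ** (-a) for r in range(1, n + 1)]


def geometric_weights(n: int, p: float) -> List[float]:
    """Unnormalized truncated geometric weights: rank r gets (1-p)**(r-1) * p."""
    return [(1 - p) ** (r - 1) * p for r in range(1, n + 1)]


def sample_requests(weights: List[float], num_requests: int, seed: int = 0) -> List[int]:
    """Draw node ids in 1..len(weights); ASSUMES rank r maps to node r."""
    rng = random.Random(seed)
    nodes = range(1, len(weights) + 1)
    return rng.choices(nodes, weights=weights, k=num_requests)


# Parameters from the Figure 3 caption: a = 4 and p = 10 / 2**10.
zipf_requests = sample_requests(zipf_weights(NUM_NODES, 4.0), 1000)
geom_requests = sample_requests(geometric_weights(NUM_NODES, 10 / 2**10), 1000)
```

Using `random.choices` with explicit weights keeps the sketch within the standard library and guarantees every sampled id is a valid node, at the price of truncating the tails of both distributions.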

Theorems & Definitions (22)

  • Definition 1
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • Theorem 5
  • proof
  • Theorem 6
  • ...and 12 more