Dependency-Aware Online Caching
Julien Dallot, Amirmehdi Jafari Fesharaki, Maciej Pacut, Stefan Schmid
TL;DR
This work addresses online caching where an item can be stored only if all its dependencies (as defined by a DAG) are also in the cache, a setting motivated by packet-classification rule caches in networks. It introduces Bucketing, a randomized online algorithm with a competitive ratio of $O(\log k)$ (more precisely $2 H_{\min\{k,\ell\}}$, which is tight), and a deterministic algorithm, Recursive LRU. The accompanying lower bounds show how the graph topology affects competitiveness through $\ell$, the maximum independent set size in the transitive closure of the dependency graph. For the bypassing variant, the authors develop BucketingBypass, which attains a bound of $O(\sqrt{k \log k})$ (refined to $6 \sqrt{k \cdot H_{\min\{k,\ell\}}}$), and they report empirical cost reductions of about 2x over TreeCaching on representative workloads. The results strengthen the guarantees of prior approaches such as CacheFlow and the SPAA 2017 tree-caching work, with practical implications for router-rule caching and centralized-control caching in networks. The work combines competitive analysis with practical validation and provides a foundation for extensions to weighted and more general caching models.
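To give a feel for the two bounds quoted above, the following minimal sketch evaluates $2 H_{\min\{k,\ell\}}$ and $6 \sqrt{k \cdot H_{\min\{k,\ell\}}}$ for given $k$ and $\ell$. The function names `harmonic`, `bucketing_bound`, and `bypass_bound` are hypothetical and chosen only for illustration; the formulas are taken directly from the summary, and nothing here reproduces the authors' code.

```python
import math


def harmonic(n: int) -> float:
    """Return the n-th harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))


def bucketing_bound(k: int, ell: int) -> float:
    """Evaluate 2 * H_{min(k, ell)}, the bound quoted for Bucketing.

    k is the cache size; ell is the maximum independent set size in the
    transitive closure of the dependency graph (both as in the summary).
    """
    return 2 * harmonic(min(k, ell))


def bypass_bound(k: int, ell: int) -> float:
    """Evaluate 6 * sqrt(k * H_{min(k, ell)}), the bound quoted for BucketingBypass."""
    return 6 * math.sqrt(k * harmonic(min(k, ell)))


if __name__ == "__main__":
    # Illustrative parameter choices only; they are not taken from the paper.
    for k, ell in [(16, 16), (1024, 4), (1024, 1024)]:
        print(k, ell, round(bucketing_bound(k, ell), 2), round(bypass_bound(k, ell), 2))
```

A small $\ell$ (for example, a dependency graph that is close to a chain) keeps both bounds low even when the cache size $k$ is large, which is the sense in which topology enters the guarantee.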
Abstract
We consider a variant of the online caching problem in which the items exhibit dependencies: an item can reside in the cache only if all the items it depends on are also in the cache. The dependency relations can form any directed acyclic graph. Such requirements arise, for example, in systems such as CacheFlow (SOSR 2016) that cache forwarding rules for packet classification in IP-based communication networks. First, we present an optimal randomized online caching algorithm that accounts for dependencies among the items. Our randomized algorithm is $O(\log k)$-competitive, where $k$ is the size of the cache, meaning that its expected cost is at most $O(\log k)$ times the cost of an optimal algorithm that knows the future input sequence. Second, we consider the bypassing model, a variant of caching with dependencies introduced by Bienkowski et al. at SPAA 2017, in which requests can be served at a fixed price without fetching the item and its dependencies into the cache. For this setting, we give an $O(\sqrt{k \cdot \log k})$-competitive algorithm, which significantly improves the best known competitiveness. In a small case study, we find that our algorithm incurs on average 2x lower cost.
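The dependency constraint can be sketched in a few lines. This is only an illustration of the feasibility condition described above, not the authors' algorithm: the mapping `deps` (item to the items it directly depends on) and the helper names `dependency_closure` and `is_feasible` are assumptions introduced here.

```python
from typing import Dict, Hashable, Iterable, Set


def dependency_closure(item: Hashable, deps: Dict[Hashable, Iterable[Hashable]]) -> Set[Hashable]:
    """Return the item together with everything it transitively depends on.

    Fetching `item` into the cache requires every element of this set to be cached.
    """
    closure: Set[Hashable] = set()
    stack = [item]
    while stack:
        v = stack.pop()
        if v in closure:
            continue
        closure.add(v)
        stack.extend(deps.get(v, ()))
    return closure


def is_feasible(cache: Set[Hashable], deps: Dict[Hashable, Iterable[Hashable]]) -> bool:
    """Check that every cached item has all its direct dependencies cached."""
    return all(set(deps.get(v, ())) <= cache for v in cache)


if __name__ == "__main__":
    # Hypothetical DAG: "a" depends on "b" and "c", which both depend on "d".
    deps = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set()}
    print(sorted(dependency_closure("a", deps)))
    print(is_feasible({"b", "d"}, deps))
    print(is_feasible({"a", "b"}, deps))
```

Checking only direct dependencies is enough in `is_feasible`: if every cached item has its direct dependencies cached, then by induction along the DAG all transitive dependencies are cached as well.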
