Revisiting Cache Freshness for Emerging Real-Time Applications
Ziming Mao, Rishabh Iyer, Scott Shenker, Ion Stoica
TL;DR
The paper argues that TTL-based caching cannot practically deliver the tight data freshness that emerging real-time applications require. It introduces a bounded-staleness model with costs $C_F$ and $C_S$ and shows that reacting to writes with updates or invalidates, batched over a window $T$, can reduce overhead compared to TTL policies. It then derives an adaptive per-object policy that chooses between updates and invalidates based on the per-object costs $c_u$, $c_i$, $c_m$ and the read/write probabilities; a practical realization uses Top-K and Count-min sketches to estimate $E[W]$. Simulations on multiple workloads show substantial gains in freshness-related throughput and latency. The paper also outlines open challenges around reliable delivery, many-to-many dependencies, and integration with eviction.
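To make the per-object decision more concrete, the following Python sketch shows one way such a policy could be wired together. It is an illustration under stated assumptions, not the paper's derivation: `CountMinSketch`, `estimate_p_read`, and `choose_action` are hypothetical names, $c_u$, $c_i$, and $c_m$ are read here as the costs of an update, an invalidate, and a subsequent cache miss, and the decision rule (send an update when $c_u < c_i + p_{read} \cdot c_m$) is a simplified stand-in that may differ from the rule the paper derives. The read probability is crudely approximated from approximate read and write counts.

```python
import hashlib


class CountMinSketch:
    """Approximate per-key counter; estimates can overcount but never undercount."""

    def __init__(self, width: int = 1024, depth: int = 4) -> None:
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key: str, row: int) -> int:
        # One independent-looking hash per row, derived by salting BLAKE2b.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key: str, count: int = 1) -> None:
        for row in range(self.depth):
            self.rows[row][self._index(key, row)] += count

    def estimate(self, key: str) -> int:
        # The minimum over rows is the least-contaminated counter.
        return min(
            self.rows[row][self._index(key, row)] for row in range(self.depth)
        )


def estimate_p_read(reads: CountMinSketch, writes: CountMinSketch, key: str) -> float:
    """Assumed proxy: fraction of observed accesses to `key` that were reads."""
    r, w = reads.estimate(key), writes.estimate(key)
    total = r + w
    return r / total if total else 0.0


def choose_action(c_u: float, c_i: float, c_m: float, p_read: float) -> str:
    """Hypothetical rule: update if cheaper than invalidate plus expected miss cost."""
    invalidate_cost = c_i + p_read * c_m
    return "update" if c_u < invalidate_cost else "invalidate"


if __name__ == "__main__":
    reads, writes = CountMinSketch(), CountMinSketch()
    for _ in range(9):
        reads.add("user:42")
    writes.add("user:42")
    p = estimate_p_read(reads, writes, "user:42")
    print(choose_action(c_u=5.0, c_i=1.0, c_m=10.0, p_read=p))
```

Under this toy rule, a read-heavy object favors updates because an invalidate would likely be followed by a costly miss, while a write-heavy object favors invalidates. A Top-K structure could restrict exact tracking to the hottest objects, with the sketch covering the long tail.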
Abstract
Caching is widely used in industry to improve application performance by reducing data-access latency and taking load off the backend infrastructure. TTLs have become the de facto mechanism for keeping cached data reasonably fresh (i.e., not too out of date with the backend). However, the emergence of real-time applications requires tighter data freshness, which is impractical to achieve with TTLs. We discuss why this is the case, and propose a simple yet effective adaptive policy to achieve the desired freshness.
