Efficient and Optimal No-Regret Caching under Partial Observation
Younes Ben Mazziane, Francescomaria Faticanti, Sara Alouf, Giovanni Neglia
TL;DR
This work addresses caching under partial observability, where each request is observed with probability $p$. It proposes NFPL, a caching policy based on Follow-the-Perturbed-Leader that uses batched updates and noisy per-file counts. NFPL achieves asymptotically optimal sublinear regret, with the bound $\mathcal{R}_T(\texttt{NFPL}) \le \dfrac{2\sqrt{2BC}}{pq} \left(\sqrt{T} + \dfrac{B}{2\sqrt{T}}\right)$, and supports $O(1)$ amortized time via three variants (S-NFPL, D-NFPL, L-NFPL) that differ in noise correlation. The paper provides rigorous regret and time-complexity guarantees and validates the approach on synthetic Zipf-like traces and real Akamai data, where NFPL often outperforms classical policies such as LFU and LRU, especially under low observability or adversarial request patterns. In practice, NFPL enables efficient, scalable no-regret caching at the network edge or in CDN caches, where the full request history is unavailable or costly to maintain.
Abstract
Online learning algorithms have been successfully used to design caching policies with regret that is sublinear in the total number of requests, without any statistical assumption about the request sequence. Most existing algorithms involve computationally expensive operations and require knowledge of all past requests, which may not be feasible in practical scenarios such as caching at a cellular base station. We therefore study the caching problem in a more restrictive setting where only a fraction of past requests are observed, and we propose a randomized caching policy with sublinear regret based on the classic online learning algorithm Follow-the-Perturbed-Leader (FPL). Our caching policy is the first to attain the asymptotically optimal regret bound while ensuring asymptotically constant amortized time complexity when requests are only partially observed. The experimental evaluation compares the proposed solution against classic caching policies and validates it on synthetic and real-world request traces.
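To make the ingredients named above concrete (partial observation, noisy per-file counts, batched updates), the following is a minimal Python sketch of an FPL-style caching policy. It is an illustration under stated assumptions, not the authors' NFPL implementation: the function name `fpl_cache_hits`, the Gaussian noise with scale `eta`, the choice to draw the noise once at the start, and the full top-`capacity` recomputation at each batch boundary are all hypothetical simplifications. In particular, the recomputation here is not the constant-amortized-time update that the paper analyzes.

```python
import heapq
import random


def fpl_cache_hits(requests, num_files, capacity, p, batch_size, eta, seed=0):
    """Count cache hits of a simple FPL-style policy under partial observation.

    Assumptions (not taken from the paper): Gaussian perturbations of scale
    `eta`, drawn once, and a full top-`capacity` selection every `batch_size`
    requests.
    """
    rng = random.Random(seed)
    # Per-file counts built only from the requests that happen to be observed.
    counts = [0] * num_files
    # One perturbation per file, drawn once (an assumption of this sketch).
    noise = [eta * rng.gauss(0.0, 1.0) for _ in range(num_files)]

    def leader():
        # Cache the `capacity` files with the largest perturbed counts.
        return set(
            heapq.nlargest(
                capacity, range(num_files), key=lambda f: counts[f] + noise[f]
            )
        )

    cache = leader()
    hits = 0
    for t, f in enumerate(requests, start=1):
        if f in cache:
            hits += 1
        # Each request is observed independently with probability p.
        if rng.random() < p:
            counts[f] += 1
        # Batched update: the cache changes only every `batch_size` requests.
        if t % batch_size == 0:
            cache = leader()
    return hits
```

For example, `fpl_cache_hits([3, 4] * 50, num_files=5, capacity=2, p=1.0, batch_size=10, eta=0.0)` replays a trace that alternates between two files. Once the first batch has been processed, both files are cached and every later request is a hit. Lowering `p` makes the counts sparser, so the perturbation `eta` has more influence on which files are cached.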
