Distance based prefetching algorithms for mining of the sporadic requests associations
Vadim Voevodkin, Andrey Sokolov
TL;DR
The paper tackles reducing storage latency by improving prefetching of sporadic read requests. It introduces the Distance Based Sporadic Prefetcher (DBSP), a lightweight algorithm that uses distances between request histories and three tables to identify associations. The authors provide a rigorous evaluation methodology and demonstrate that DBSP outperforms the Mithril baseline with modest increases in cache hit ratio and acceptable storage overhead. The work offers practical integration guidance and a framework for consistent comparison of sporadic prefetchers across storage systems.
Abstract
Modern storage systems rely heavily on data prefetching algorithms when processing sequences of read requests. The performance of a prefetching algorithm (for instance, the increase in the cache hit ratio, CHR, of the cache system) directly affects the overall performance characteristics of the storage system (read latency, IOPS, etc.). Widely known prefetching algorithms focus on discovering sequential patterns in the request stream. This study examines a family of prefetching algorithms that instead mine pseudo-random (sporadic) patterns between read requests: sporadic prefetching algorithms. The key contribution of this paper is a new, lightweight family of distance-based sporadic prefetching algorithms (DBSP) that outperforms the best previously known results on the MSR traces collection. Another important contribution is a thorough description of the procedure for comparing the performance of sporadic prefetchers.
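
To make the CHR metric concrete, the following minimal sketch replays a toy request trace through a small LRU cache, once without prefetching and once with a hard-coded association between two non-sequential blocks. It is an illustration only and not the DBSP algorithm: the function name `cache_hit_ratio`, the `prefetch` callback, the trace, and the cache capacity are all assumptions introduced here, and the LRU eviction policy is assumed for simplicity.

```python
from collections import OrderedDict


def cache_hit_ratio(trace, capacity, prefetch=None):
    """Replay `trace` (a list of block ids) through an LRU cache.

    `prefetch` is a hypothetical callback: given the block just requested,
    it returns the blocks to load into the cache ahead of time.
    Returns hits / total requests (0.0 for an empty trace).
    """
    cache = OrderedDict()  # keys are block ids, ordered from oldest to newest
    hits = 0

    def insert(block):
        # Mark the block as most recently used and evict the oldest entries.
        cache[block] = True
        cache.move_to_end(block)
        while len(cache) > capacity:
            cache.popitem(last=False)

    for block in trace:
        if block in cache:
            hits += 1
        insert(block)
        if prefetch is not None:
            for extra in prefetch(block):
                insert(extra)

    return hits / len(trace) if trace else 0.0


# Toy trace (assumed): block 42 always follows block 7, although the two
# addresses are not adjacent, so a sequential prefetcher would not help.
trace = [7, 42, 1, 2, 3, 7, 42, 4, 5, 6, 7, 42]

baseline = cache_hit_ratio(trace, capacity=2)
with_prefetch = cache_hit_ratio(
    trace, capacity=2, prefetch=lambda b: [42] if b == 7 else []
)
print(baseline, with_prefetch)
```

In this toy setting the association is supplied by hand; a sporadic prefetcher has to mine such associations from the request stream itself, and the resulting change in CHR is the quantity being compared.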
