Improved Prefetching Techniques for Linked Data Structures
Nikola Vuk Maruszewski
TL;DR
Linked data structures (LDSs) challenge traditional memory prefetching because their nodes are scattered in memory and their access patterns are irregular. The authors propose Linkey, a hybrid hardware-software prefetcher that uses lightweight software-provided metadata to configure an Address Table (AT), a Child Association Table (CAT), and a Backup Fetch Queue (BFQ), enabling timely, accurate prefetches without speculative pointer detection. Across traversal and lookup benchmarks, Linkey delivers a geometric mean miss-rate reduction of $13\%$ (up to $58.8\%$) and a geometric mean accuracy improvement of $65.4\%$, with IPC gains of up to $12.1\%$ on applicable workloads. This work demonstrates how modest programmer/compiler hints, coupled with hardware tables and a flexible fetch pipeline, can substantially reduce memory stalls for pointer-chasing patterns, offering a practical path toward improved performance in LDS-heavy applications.
Abstract
With ever-increasing main memory stall times, we need novel techniques to reduce effective memory access latencies. Prefetching has been shown to be an effective solution, especially for contiguous data structures that follow the traditional principles of spatial and temporal locality. However, on linked data structures, which are made up of many nodes linked together with pointers, typical prefetchers struggle: they fail to predict accesses because elements are scattered arbitrarily throughout memory and access patterns can be arbitrarily complex. To remedy these issues, we introduce $\textit{Linkey}$, a novel prefetcher that uses hints from the programmer/compiler to cache layout information and accurately prefetch linked data structures. $\textit{Linkey}$ obtains substantial performance improvements over a striding baseline. We achieve a geomean 13% reduction in miss rate with a maximum improvement of 58.8%, and a 65.4% geomean increase in accuracy, with many benchmarks improving from 0%. On benchmarks where $\textit{Linkey}$ is applicable, we observe a geomean IPC improvement of 1.40%, up to 12.1%.
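To make the motivation concrete, the following toy Python model contrasts a contiguous array with a pointer-linked traversal under a simple stride predictor. It is only an illustrative sketch: the predictor, the 64-byte node size, the base address, and the shuffled layout are all assumptions made here, and it models neither the paper's striding baseline nor $\textit{Linkey}$ itself.

```python
import random


def stride_prefetch_hits(addresses):
    """Count accesses that a toy stride predictor would have prefetched.

    After each access the predictor guesses next = current + (current - previous).
    This is a hypothetical model for illustration only.
    """
    hits = 0
    predicted = None
    for i, addr in enumerate(addresses):
        if predicted is not None and addr == predicted:
            hits += 1
        if i >= 1:
            stride = addr - addresses[i - 1]
            predicted = addr + stride
    return hits


NODE_SIZE = 64  # assumed node size in bytes (one cache line)
N = 1000        # number of elements / nodes in the traversal

# Contiguous array: element i lives at base + i * NODE_SIZE.
array_addrs = [0x1000 + i * NODE_SIZE for i in range(N)]

# Linked list: the same addresses, but visited in a shuffled order to mimic
# nodes that were allocated in an order unrelated to the traversal order.
rng = random.Random(0)
list_addrs = array_addrs[:]
rng.shuffle(list_addrs)

array_hits = stride_prefetch_hits(array_addrs)
list_hits = stride_prefetch_hits(list_addrs)
print(f"array: {array_hits}/{N} predicted, linked list: {list_hits}/{N} predicted")
```

In this model the array traversal is predicted on every access after the first two, while the shuffled traversal is predicted only by coincidence. That gap is the one that software-provided layout hints are meant to close.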
