Table of Contents
Fetching ...

ORAP: Optimized Row Access Prefetching for Rowhammer-mitigated Memory

Maccoy Merrell, Daniel Puckett, Gino Chacon, Jeffrey Stuecheli, Stavros Kalafatis, Paul V. Gratz

TL;DR

The Optimized Row Access Prefetcher (ORAP) is proposed, leveraging last-level-cache (LLC) space to cache large portions of DRAM rowbuffer contents to reduce the need for future activations and reduce energy overheads.

Abstract

Rowhammer is a well-studied DRAM phenomenon wherein multiple activations to a given row can cause bit flips in adjacent rows. Many mitigation techniques have been introduced to address Rowhammer, with some support being incorporated into the JEDEC DDR5 standard for per-row-activation-counter (PRAC) and refresh-management (RFM) systems. Mitigation schemes built on these mechanisms claim to have various levels of area, power, and performance overheads. To date the evaluation of existing mitigation schemes typically neglects the impact of other memory system components such as hardware prefetchers. Nearly all modern systems incorporate hardware prefetching and these can significantly improve processor performance through speculative cache population. These prefetchers induce higher numbers of downstream memory requests and increase DRAM activation rates. The performance overhead of Rowhammer mitigations are tied directly to memory access patterns, exposing both hardware prefetchers and Rowhammer mitigations to cross-interaction. We find that the performance improvement provided by prior-work hardware prefetchers is often severely impacted by Rowhammer mitigations. In effect, much of the benefit of speculative memory references from prefetching lies in accelerating and reordering DRAM references in ways that trigger mitigations, significantly reducing the benefits of prefetching. This work proposes the Optimized Row Access Prefetcher (ORAP), leveraging last-level-cache (LLC) space to cache large portions of DRAM rowbuffer contents to reduce the need for future activations. Working with the state-of-the-art Berti prefetcher, ORAP reduces DRAM activation rates by 51.3% and achieves a 4.6% speedup over the prefetcher configuration of Berti and SPP-PPF when prefetching in an RFM-mitigated memory system. Under PRAC mitigations, ORAP reduces energy overheads by 11.8%.

ORAP: Optimized Row Access Prefetching for Rowhammer-mitigated Memory

TL;DR

The Optimized Row Access Prefetcher (ORAP) is proposed, leveraging last-level-cache (LLC) space to cache large portions of DRAM rowbuffer contents to reduce the need for future activations and reduce energy overheads.

Abstract

Rowhammer is a well-studied DRAM phenomenon wherein multiple activations to a given row can cause bit flips in adjacent rows. Many mitigation techniques have been introduced to address Rowhammer, with some support being incorporated into the JEDEC DDR5 standard for per-row-activation-counter (PRAC) and refresh-management (RFM) systems. Mitigation schemes built on these mechanisms claim to have various levels of area, power, and performance overheads. To date the evaluation of existing mitigation schemes typically neglects the impact of other memory system components such as hardware prefetchers. Nearly all modern systems incorporate hardware prefetching and these can significantly improve processor performance through speculative cache population. These prefetchers induce higher numbers of downstream memory requests and increase DRAM activation rates. The performance overhead of Rowhammer mitigations are tied directly to memory access patterns, exposing both hardware prefetchers and Rowhammer mitigations to cross-interaction. We find that the performance improvement provided by prior-work hardware prefetchers is often severely impacted by Rowhammer mitigations. In effect, much of the benefit of speculative memory references from prefetching lies in accelerating and reordering DRAM references in ways that trigger mitigations, significantly reducing the benefits of prefetching. This work proposes the Optimized Row Access Prefetcher (ORAP), leveraging last-level-cache (LLC) space to cache large portions of DRAM rowbuffer contents to reduce the need for future activations. Working with the state-of-the-art Berti prefetcher, ORAP reduces DRAM activation rates by 51.3% and achieves a 4.6% speedup over the prefetcher configuration of Berti and SPP-PPF when prefetching in an RFM-mitigated memory system. Under PRAC mitigations, ORAP reduces energy overheads by 11.8%.
Paper Structure (30 sections, 19 figures, 4 tables)

This paper contains 30 sections, 19 figures, 4 tables.

Figures (19)

  • Figure 1: Impact of Rowhammer mitigations on prefetcher single-core performance improvement (top) and DRAM dynamic energy per thousand instructions (bottom)
  • Figure 2: The hierarchical organization of a dual in-line memory module (DIMM)
  • Figure 3: 64 GiB DRAM logical-physical mappings
  • Figure 4: RFM and PRAC Architecture
  • Figure 5: Geomean increase in activations per thousand instructions (APKI) when adding prefetchers on single-core workloads.
  • ...and 14 more figures