Per-Row Activation Counting on Real Hardware: Demystifying Performance Overheads
Jumin Kim, Seungmin Baek, Minbok Wi, Hwayong Nam, Michael Jaemin Kim, Sukhan Lee, Kyomin Sohn, Jung Ho Ahn
TL;DR
This work addresses the gap between simulator-based PRAC assessments and real hardware performance. It implements PRAC timing changes on current Intel CPUs and validates them with microbenchmarks before measuring SPEC CPU2017 workloads. The average PRAC overhead on real hardware is only 1.06%, with a peak of 3.28%, which is up to 9.15x lower than prior simulator-based reports, and the overhead scales with RBMPKI (row-buffer misses per kilo-instruction). The memory page policy matters: a close page policy minimizes the overhead by hiding the longer row precharge operations that PRAC introduces from the critical path, which suggests that PRAC is practical with a suitable memory controller policy.
Abstract
Per-Row Activation Counting (PRAC), a DRAM read disturbance mitigation method, modifies key DRAM timing parameters, reportedly causing significant performance overheads in simulator-based studies. However, given known discrepancies between simulators and real hardware, real-machine experiments are vital for accurate PRAC performance estimation. We present the first real-machine performance analysis of PRAC. After verifying timing modifications on the latest CPUs using microbenchmarks, our analysis shows that PRAC's average and maximum overheads are just 1.06% and 3.28% for the SPEC CPU2017 workloads -- up to 9.15x lower than simulator-based reports. Further, we show that the close page policy minimizes this overhead by effectively hiding the elongated DRAM row precharge operations due to PRAC from the critical path.
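The two quantities the summary relies on, relative overhead and RBMPKI, together with the intuition behind the page-policy result, can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the authors' measurement methodology: the function names are hypothetical, the timing values are placeholders rather than figures from the paper, and the close page case assumes the precharge issued after the previous access has already finished when the next request arrives.

```python
def rbmpki(row_buffer_misses: int, instructions: int) -> float:
    """Row-buffer misses per thousand retired instructions."""
    return row_buffer_misses / (instructions / 1000)


def overhead_percent(baseline_seconds: float, prac_seconds: float) -> float:
    """Slowdown of a PRAC-timing run relative to a baseline run, in percent."""
    return (prac_seconds - baseline_seconds) / baseline_seconds * 100.0


def open_page_conflict_latency(t_rp: int, t_rcd: int, t_cl: int) -> int:
    """Open page policy, request to a different row than the open one:
    the precharge (tRP) sits on the critical path before activate and read."""
    return t_rp + t_rcd + t_cl


def close_page_access_latency(t_rcd: int, t_cl: int) -> int:
    """Close page policy: the row was precharged right after the previous
    access (assumed to have completed), so only activate and read remain."""
    return t_rcd + t_cl


if __name__ == "__main__":
    # Placeholder timings in nanoseconds (assumptions, not values from the paper).
    t_rcd, t_cl = 16, 16
    t_rp_baseline, t_rp_prac = 16, 36  # PRAC is modeled here as a longer tRP only

    print("open page, baseline tRP:",
          open_page_conflict_latency(t_rp_baseline, t_rcd, t_cl), "ns")
    print("open page, PRAC tRP:    ",
          open_page_conflict_latency(t_rp_prac, t_rcd, t_cl), "ns")
    print("close page, either tRP: ",
          close_page_access_latency(t_rcd, t_cl), "ns")

    # Hypothetical counter values and run times, for illustration only.
    print("RBMPKI:", rbmpki(row_buffer_misses=5_000, instructions=1_000_000))
    print("overhead (%):", overhead_percent(baseline_seconds=100.0,
                                            prac_seconds=101.0))
```

In this simplified model the longer precharge adds directly to every row-conflict access under an open page policy, so the penalty grows with the row-buffer miss rate, while under a close page policy the access latency does not depend on tRP as long as the earlier precharge completes in time.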
