ARC: DVFS-Aware Asymmetric-Retention STT-RAM Caches for Energy-Efficient Multicore Processors

Dhruv Gajaria; Tosiron Adegbija

TL;DR

This work investigates how dynamic voltage and frequency scaling (DVFS) interacts with relaxed-retention STT-RAM caches in multicore processors. It shows that clock frequency and retention time jointly determine cache behavior, including the number of expiration misses (misses on blocks whose data was lost because the retention time elapsed), so a retention time chosen without regard to DVFS can be suboptimal. To exploit this, the authors design ARC, a DVFS-aware asymmetric-retention core architecture whose cores are tuned to different retention times and frequency ranges, together with a runtime decision-tree predictor that maps each application to the best core. The reported results show energy reductions at both the cache and processor levels (up to ~39% cache energy and ~13% processor energy savings compared to SRAM) and competitive gains over homogeneous STT-RAM designs, with manageable overheads and the potential to scale. The work highlights the potential of combining retention-time specialization with DVFS for energy-efficient multicore STT-RAM caches and suggests avenues for extending ARC to larger, more complex systems.

Abstract

Relaxed retention (or volatile) spin-transfer torque RAM (STT-RAM) has been widely studied as a way to reduce STT-RAM's write energy and latency overheads. Given a relaxed retention time STT-RAM level-one (L1) cache, we analyze the impact of dynamic voltage and frequency scaling (DVFS), a common optimization in modern processors, on STT-RAM L1 cache design. Our analysis reveals that, in addition to different applications requiring different retention times, the clock frequency, which most STT-RAM studies ignore, may also significantly affect applications' retention time needs. Based on our findings, we propose an asymmetric-retention core (ARC) design for multicore architectures. ARC features retention time heterogeneity to specialize STT-RAM retention times to applications' needs. We also propose a runtime prediction model to determine the best core on which to run an application, based on the application's characteristics, its retention time requirements, and the available DVFS settings. Results reveal that the proposed approach can reduce the average cache energy by 20.19% and overall processor energy by 7.66%, compared to a homogeneous STT-RAM cache design.
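The dependence of retention time needs on clock frequency can be illustrated with a minimal sketch. It is not the paper's model: it assumes the gap between a block's write and its next read is a fixed number of cycles regardless of frequency (ignoring memory stalls that do not scale with the core clock), and the gap values, the 50 µs retention time, and the function name `expiration_misses` are hypothetical. The two frequencies are the endpoints of the 0.8GHz to 2.0GHz range that appears in the paper's figure captions.

```python
def expiration_misses(reuse_gaps_cycles, freq_ghz, retention_us):
    """Count reads that arrive after the block's retention time has elapsed.

    reuse_gaps_cycles: cycles between a block's write and its next read
                       (hypothetical, assumed independent of frequency).
    freq_ghz:          core clock frequency in GHz.
    retention_us:      STT-RAM retention time in microseconds.
    """
    cycles_per_us = freq_ghz * 1e3  # 1 GHz corresponds to 1000 cycles per microsecond
    # A read is an expiration miss when the wall-clock gap exceeds the retention time.
    return sum(1 for gap in reuse_gaps_cycles
               if gap / cycles_per_us > retention_us)


# Hypothetical write-to-read gaps, in cycles, for four cache blocks.
gaps = [5_000, 20_000, 60_000, 150_000]
retention_us = 50.0  # hypothetical retention time

for freq_ghz in (0.8, 2.0):
    misses = expiration_misses(gaps, freq_ghz, retention_us)
    print(f"{freq_ghz} GHz: {misses} expiration misses")
```

Under these assumptions the same cycle gap spans less wall-clock time at a higher frequency, so fewer blocks outlive their retention time: the 60,000-cycle gap lasts 75 µs at 0.8 GHz (expired) but only 30 µs at 2.0 GHz (still valid).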

Paper Structure (22 sections, 1 equation, 13 figures, 3 tables)

Introduction
Background and Related Work
Overview of STT-RAM Caches
Mitigating the Overheads of Relaxed Retention Time STT-RAM
Dynamic Voltage Frequency Scaling
Heterogeneous Multicore Architectures
Interplay of DVFS and STT-RAM Caches
Impact of Variable Clock Frequency on STT-RAM Caches
Impact of Frequency on Expiration Misses
DVFS-Aware Asymmetric-Retention Core (ARC)
Proposed ARC Architecture
Overview of ARC Prediction Approach
ARC Prediction Model
Scalability
Experimental Setup
...and 7 more sections

Figures (13)

Figure 1: STT-RAM cell structure. The anti-parallel state is the high-resistance state and the parallel state is the low-resistance state
Figure 2: Impact of frequency scaling on performance and processor energy compared to SRAM. SRAM caches are faster than STT-RAM caches but consume more energy
Figure 3: Illustration of expiration misses. Assume that blocks A and B occupy the same memory location, i.e., writing one block evicts the currently resident block
Figure 4: Change in miss rate with respect to frequency. Miss rates change because expiration misses decrease as frequency increases
Figure 5: Decrease in cache miss rate as frequency increases from 0.8GHz to 2.0GHz for various retention times. For specific retention times, the miss rate changes substantially with frequency because cache block lifetimes vary across benchmarks
...and 8 more figures
