RED: Energy Optimization Framework for eDRAM-based PIM with Reconfigurable Voltage Swing and Retention-aware Scheduling

Jae-Young Kim, Donghyuk Kim, Seungjae Yoo, Sungyeob Yoo, Teokkyu Suh, Joo-Young Kim

TL;DR

This work addresses the high energy cost of memory in eDRAM-based PIM for Transformer workloads by introducing RED, a framework that combines retention-aware scheduling with a reconfigurable eDRAM memory subsystem. RED pre-estimates energy across tiling schemes and memory operations, identifies the most energy-efficient configuration, and then tunes the memory at runtime via VPD-controlled RBL swing, sense-amplifier gating, and refresh skipping. The framework achieves up to 3.05× higher energy efficiency than Neural Cache and up to 8.16× higher than eDRAM baselines, with modest area overhead (~3.5%) and low scheduling energy. These results suggest that co-designing the memory with the scheduler, in particular memory-access power optimization and workload-aware tiling, can yield substantial energy savings for memory-bound AI accelerators, and the approach may apply beyond the tested Transformer workloads.

Abstract

In the era of artificial intelligence (AI), the Transformer has demonstrated its performance across various applications. Its large number of parameters incurs high latency and energy overhead when processed on a von Neumann architecture. Processing-in-memory (PIM) has shown potential for accelerating data-intensive applications by reducing data movement. While previous works mainly optimize the computational part of PIM to enhance energy efficiency, the importance of memory design, which consumes the most power in PIM, has been rather neglected. In this work, we present RED, an energy optimization framework for eDRAM-based PIM. We first analyze the PIM operations in eDRAM and obtain two key observations: 1) memory access energy consumption is predominant in PIM, and 2) read bitline (RBL) voltage swing, sense amplifier power, and retention time trade off against one another. Leveraging these observations, we propose a novel reconfigurable eDRAM and retention-aware scheduling that minimize the runtime energy consumption of the eDRAM macro. The framework pinpoints the optimal operating point by pre-estimating energy consumption across all possible tiling schemes and memory operations. The reconfigurable eDRAM then controls the RBL voltage swing at runtime according to the schedule, optimizing the memory access power. Moreover, RED employs refresh skipping and sense amplifier power gating to mitigate the energy overhead arising from this trade-off. Overall, the RED framework achieves up to 3.05× higher energy efficiency than the prior SRAM-based PIM and reduces the energy consumption of the eDRAM macro by up to 74.88% with the reconfigurable eDRAM and optimization schemes, while requiring only 3.5% area overhead and 0.77% energy overhead for scheduling.
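The pre-estimation step described in the abstract can be pictured as an exhaustive search over (tiling scheme, RBL swing mode) pairs. The sketch below is only an illustration of that idea, not the authors' scheduler: the class names, the energy model, the refresh-skipping rule (no refresh when the data lifetime fits within the retention time), the assumed direction of the trade-off (a smaller swing lowers access energy but raises sense-amplifier energy and shortens retention), and all numbers are hypothetical and in arbitrary units.

```python
import math
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SwingMode:
    """Hypothetical RBL voltage-swing setting (all values are made up)."""
    name: str
    access_energy: int    # bitline energy per access
    sa_energy: int        # sense-amplifier energy per access
    retention_time: int   # how long a cell holds data without refresh
    refresh_energy: int   # energy of one refresh pass over the tile


@dataclass(frozen=True)
class Tiling:
    """Hypothetical tiling scheme summarized by two numbers."""
    name: str
    accesses: int         # total eDRAM accesses for the workload
    data_lifetime: int    # how long a tile must stay valid in eDRAM


def estimate_energy(tiling: Tiling, mode: SwingMode) -> int:
    """Pre-estimate eDRAM energy for one (tiling, swing mode) pair."""
    access = tiling.accesses * (mode.access_energy + mode.sa_energy)
    # Assumed refresh-skipping rule: data that expires before the
    # retention time runs out never needs a refresh.
    if tiling.data_lifetime <= mode.retention_time:
        refreshes = 0
    else:
        refreshes = math.ceil(tiling.data_lifetime / mode.retention_time) - 1
    return access + refreshes * mode.refresh_energy


def pick_operating_point(tilings, modes):
    """Return the (tiling, mode) pair with the lowest estimated energy."""
    return min(product(tilings, modes), key=lambda pair: estimate_energy(*pair))


# Toy inputs, chosen only to show the trade-off.
MODES = [
    SwingMode("large_swing", access_energy=20, sa_energy=5,
              retention_time=100, refresh_energy=5000),
    SwingMode("small_swing", access_energy=10, sa_energy=8,
              retention_time=40, refresh_energy=5000),
]
TILINGS = [
    Tiling("tile_A", accesses=1000, data_lifetime=30),
    Tiling("tile_B", accesses=800, data_lifetime=90),
]

best_tiling, best_mode = pick_operating_point(TILINGS, MODES)
best_energy = estimate_energy(best_tiling, best_mode)
print(best_tiling.name, best_mode.name, best_energy)
```

With these toy numbers, the tiling with fewer accesses (`tile_B`) is not the winner: its longer data lifetime forces refreshes under the small swing, so the search settles on `tile_A` with `small_swing`. This is the kind of interaction between tiling and retention that a retention-aware scheduler has to weigh.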

Paper Structure

This paper contains 18 sections, 3 equations, 14 figures, 1 table.

Figures (14)

  • Figure 1: (a) PIM Macro (b) Power Consumption Breakdown
  • Figure 2: (a) Memory and 2T eDRAM Cell Structure (b) Power Breakdown of eDRAM Access
  • Figure 3: Breakdown of Memory Energy Consumption on Different Use Cases
  • Figure 4: (a) eDRAM Operation with Large RBL Voltage Swing (b) eDRAM Operation with Small RBL Voltage Swing (c) Trade-off Among RBL Voltage Swing, Sense Amplifier Power, and Retention Time Analysis
  • Figure 5: Overview of The RED Framework
  • ...and 9 more figures