Simultaneous Many-Row Activation in Off-the-Shelf DRAM Chips: Experimental Characterization and Analysis
Ismail Emir Yuksel, Yahya Can Tugrul, F. Nisa Bostanci, Geraldo F. Oliveira, A. Giray Yaglikci, Ataberk Olgun, Melina Soysal, Haocong Luo, Juan Gómez-Luna, Mohammad Sadrosadati, Onur Mutlu
TL;DR
The paper investigates the computational potential of COTS DDR4 DRAM chips by experimentally characterizing 120 chips across 18 modules, focusing on simultaneous activation of up to $32$ rows, MAJ$X$ operations with $X \in \{3,5,7,9\}$, and Multi-/RowCopy functionality. It reveals that input operand replication, a hypothesized hierarchical row-decoder, and carefully tuned timing enable robust in-DRAM computations despite variability in data patterns, temperature, and voltage. The study delivers 18 empirical observations and 7 takeaways, including strong positive results for MAJ$5$/$7$/$9$ and near-perfect performance for Multi-/RowCopy in many configurations, while also noting limitations across some chip vendors (e.g., Samsung). The authors also provide SPICE-backed insight into the circuit-level mechanisms and demonstrate practical case studies like seven MAJX-based microbenchmarks and cold-boot attack prevention via rapid content destruction, underscoring the potential and challenges of integrating DRAM-based computation into future systems.
Abstract
We experimentally analyze the computational capability of commercial off-the-shelf (COTS) DRAM chips and the robustness of these capabilities under various timing delays between DRAM commands, data patterns, temperature, and voltage levels. We extensively characterize 120 COTS DDR4 chips from two major manufacturers. We highlight four key results of our study. First, COTS DRAM chips are capable of 1) simultaneously activating up to 32 rows (i.e., simultaneous many-row activation), 2) executing a majority of X (MAJX) operation where X>3 (i.e., MAJ5, MAJ7, and MAJ9 operations), and 3) copying a DRAM row (concurrently) to up to 31 other DRAM rows, which we call Multi-RowCopy. Second, storing multiple copies of MAJX's input operands on all simultaneously activated rows drastically increases the success rate (i.e., the percentage of DRAM cells that correctly perform the computation) of the MAJX operation. For example, MAJ3 with 32-row activation (i.e., replicating each MAJ3's input operands 10 times) has a 30.81% higher average success rate than MAJ3 with 4-row activation (i.e., no replication). Third, the data pattern affects the success rate of MAJX and Multi-RowCopy operations by 11.52% and 0.07% on average, respectively. Fourth, simultaneous many-row activation, MAJX, and Multi-RowCopy operations are highly resilient to temperature and voltage changes, with small success rate variations of at most 2.13% among all tested operations. We believe these empirical results demonstrate the promising potential of using DRAM as a computation substrate. To aid future research and development, we open-source our infrastructure at https://github.com/CMU-SAFARI/SiMRA-DRAM.
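To make the replication arithmetic in the abstract concrete, the following is a minimal functional sketch in Python, not the authors' infrastructure. It models each DRAM row as an integer bit vector and computes a bitwise majority in software. It does not model charge sharing, timing, or success rates. The function names (`majority`, `replication_plan`, `replicated_majority`) are hypothetical, and because the abstract does not say how rows left over after replication are handled, this sketch simply leaves them out.

```python
from typing import List, Tuple


def majority(rows: List[int], width: int) -> int:
    """Bitwise majority vote across rows, each row an int used as a bit vector.

    A bit of the result is 1 only if strictly more than half of the rows
    have a 1 in that position (a tie would resolve to 0).
    """
    result = 0
    for bit in range(width):
        ones = sum((row >> bit) & 1 for row in rows)
        if 2 * ones > len(rows):
            result |= 1 << bit
    return result


def replication_plan(x: int, activated_rows: int) -> Tuple[int, int]:
    """Return (copies of each operand, leftover rows) for MAJX on N activated rows."""
    return divmod(activated_rows, x)


def replicated_majority(operands: List[int], copies: int, width: int) -> int:
    """Majority over rows holding `copies` copies of every operand.

    Assumption: leftover rows are ignored here; the abstract does not
    describe how they are treated in hardware.
    """
    rows = [op for op in operands for _ in range(copies)]
    return majority(rows, width)


if __name__ == "__main__":
    # MAJ3 on 32 rows: 10 copies per operand, matching the abstract's example.
    copies, leftover = replication_plan(3, 32)
    print(copies, leftover)

    a, b, c = 0b1100, 0b1010, 0b0110
    # Replication does not change the logical result, only (per the paper)
    # how reliably real cells produce it.
    print(bin(majority([a, b, c], 4)), bin(replicated_majority([a, b, c], copies, 4)))
```

In this model, replicating every operand the same number of times leaves the logical result unchanged, and with an odd X no tie can occur. The reliability benefit reported in the abstract is an electrical effect that this sketch does not capture.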
