In-Memory Mirroring: Cloning Without Reading
Simranjeet Singh, Ankit Bende, Chandan Kumar Jha, Vikas Rana, Rolf Drechsler, Sachin Patkar, Farhad Merchant
TL;DR
In-memory mirroring (IMM) enables data cloning inside resistive memory crossbars to resolve data dependencies without energy-intensive read and write-back cycles. By exploiting 1T1R RRAM cells and voltage-configured row/column schemes, IMM supports bit-level and word-level cloning in parallel, achieving a complexity of $\mathcal{O}(1)$ for word cloning. SPICE-level validation with the JART VCM v1b model demonstrates substantial energy savings (approximately $10$–$11$ pJ per clone) and a twofold speedup over conventional copying methods, highlighting IMM's potential to improve energy efficiency and performance in logic-in-memory (LiM) architectures. The study also discusses initialization and compiler-level dependency management, and outlines future experimental validation on fabricated crossbars to confirm practicality.
Abstract
In-memory computing (IMC) has gained significant attention recently as it attempts to reduce the impact of memory bottlenecks. Numerous schemes for digital IMC have been presented in the literature, focusing on logic operations. Often, an application's description has data dependencies that must be resolved. Contemporary IMC architectures perform a read followed by a write operation for this purpose, which results in performance and energy penalties. To solve this fundamental problem, this paper presents in-memory mirroring (IMM). IMM eliminates the need for read and write-back steps, thus avoiding these penalties. Instead, data is moved within the memory itself through row-wise and column-wise transfers. Additionally, the IMM scheme enables parallel cloning of an entire row (word) with a complexity of $\mathcal{O}(1)$. Moreover, we analyze the energy consumption of the proposed technique using a resistive random-access memory crossbar and the experimentally validated JART VCM v1b model. IMM increases energy efficiency and shows a $2\times$ performance improvement compared to conventional data movement methods.
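The contrast between conventional copying and mirroring can be illustrated with a purely behavioral toy model. This is only a sketch under stated assumptions: the class `ToyCrossbar`, its methods, and the step-counting cost model (one step per row read, one per row write, one per row mirror) are hypothetical and are not taken from the paper. It does not model 1T1R cells, applied voltages, or energy.

```python
class ToyCrossbar:
    """Hypothetical behavioral stand-in for a crossbar: a grid of bits plus a step counter."""

    def __init__(self, rows, cols):
        self.cells = [[0] * cols for _ in range(rows)]
        self.steps = 0  # assumed cost model: one step per row-level operation

    def read_row(self, r):
        # Conventional read: the word leaves the array through the periphery.
        self.steps += 1
        return list(self.cells[r])

    def write_row(self, r, word):
        # Conventional write: the word is written back into the array.
        self.steps += 1
        self.cells[r] = list(word)

    def copy_conventional(self, src, dst):
        # Read followed by write-back: two steps under the assumed cost model.
        self.write_row(dst, self.read_row(src))

    def mirror_row(self, src, dst):
        # Mirroring: all bits of the word are cloned in parallel inside the
        # array, counted here as a single step regardless of word width.
        self.steps += 1
        self.cells[dst] = list(self.cells[src])


xb = ToyCrossbar(rows=4, cols=8)
xb.write_row(0, [1, 0, 1, 1, 0, 0, 1, 0])

xb.steps = 0
xb.copy_conventional(src=0, dst=1)
conventional_steps = xb.steps

xb.steps = 0
xb.mirror_row(src=0, dst=2)
mirror_steps = xb.steps

print(conventional_steps, mirror_steps)
```

Under this assumed cost model the step count for mirroring does not depend on the word width, which is the sense in which word cloning is $\mathcal{O}(1)$, and the two-steps-versus-one ratio is in line with the reported $2\times$ improvement. The actual timing and energy figures come from the authors' SPICE-level analysis, not from a model like this one.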
