Harnessing Nonidealities in Analog In-Memory Computing Circuits: A Physical Modeling Approach for Neuromorphic Systems
Yusuke Sakemi, Yuji Okamoto, Takashi Morie, Sou Nobukawa, Takeo Hosomi, Kazuyuki Aihara
TL;DR
The paper tackles the energy-efficiency barrier of large-scale deep learning with a bottom-up, physics-aware approach that models in-memory computing (IMC) nonidealities directly as ODE-based physical neural networks (PNNs). It introduces differentiable spike-time discretization (DSTD) to make training of these IMC-aware networks scalable while still capturing reversal-potential dynamics, and demonstrates that such nonidealities can be harnessed to improve learning rather than merely degrading performance. Through Fashion-MNIST and CIFAR-10 experiments and post-layout sky130 SPICE validation, the authors show that reversal potentials can be exploited when incorporated into training, and that hardware-aware PNN models closely match circuit dynamics, reducing modeling error by at least an order of magnitude compared with top-down mappings. The work offers a pathway to energy-efficient neuromorphic computing by integrating IMC nonidealities into the learning process, with DSTD cutting training cost by up to 20 times in speed and 100 times in memory.
Abstract
Large-scale deep learning models are increasingly constrained by their immense energy consumption, which limits their scalability and their applicability to edge intelligence. In-memory computing (IMC) offers a promising solution: by addressing the von Neumann bottleneck inherent in traditional deep learning accelerators, it significantly reduces energy consumption. However, the analog nature of IMC introduces hardware nonidealities that degrade model performance and reliability. This paper presents a novel approach that directly trains physical models of IMC, formulated as ordinary-differential-equation (ODE)-based physical neural networks (PNNs). To enable the training of large-scale networks, we propose a technique called differentiable spike-time discretization (DSTD), which makes the training of ODE-based PNNs up to 20 times faster and up to 100 times more memory-efficient. On the CIFAR-10 dataset, we demonstrate that such large-scale networks improve learning performance by exploiting hardware nonidealities. The proposed bottom-up methodology is validated through post-layout SPICE simulations of an IMC circuit with nonideal characteristics, designed in the sky130 process. The proposed PNN approach reduces the discrepancy between model behavior and circuit dynamics by at least an order of magnitude. This work paves the way for leveraging nonideal physical devices, such as non-volatile resistive memories, for energy-efficient deep learning applications.
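To make the idea of an ODE-based neuron model with reversal potentials concrete, the following is a minimal Python sketch. It is not the paper's PNN model and not DSTD. It assumes a generic conductance-based membrane equation, C dV/dt = sum_i g_i H(t - t_i) (E_i - V), in which each input spike at time t_i switches on a conductance g_i toward a reversal potential E_i. All names (`simulate_euler`, `simulate_piecewise`, `c_mem`) and parameter values are hypothetical. The sketch contrasts fine-grained forward-Euler integration with an event-driven solution that is exact between spikes, which hints at why coarser, spike-time-based discretization can be far cheaper than stepping an ODE solver.

```python
import math


def simulate_euler(spike_times, conductances, reversals, t_end,
                   dt=1e-4, c_mem=1.0, v0=0.0):
    """Forward-Euler integration of the assumed membrane equation
    C dV/dt = sum_i g_i * H(t - t_i) * (E_i - V)."""
    v = v0
    for k in range(int(round(t_end / dt))):
        t = k * dt
        # Only synapses whose input spike has already arrived contribute.
        current = sum(g * (e - v)
                      for t_i, g, e in zip(spike_times, conductances, reversals)
                      if t >= t_i)
        v += dt * current / c_mem
    return v


def simulate_piecewise(spike_times, conductances, reversals, t_end,
                       c_mem=1.0, v0=0.0):
    """Event-driven solution: between spikes the ODE is linear with constant
    coefficients, so V relaxes exponentially toward sum(g*E) / sum(g)."""
    v, t_prev = v0, 0.0
    g_sum, ge_sum = 0.0, 0.0
    events = sorted(zip(spike_times, conductances, reversals))
    # A sentinel event at t_end closes the final segment.
    for t_i, g, e in events + [(t_end, 0.0, 0.0)]:
        t_i = min(t_i, t_end)
        if g_sum > 0.0:
            v_inf = ge_sum / g_sum
            v = v_inf + (v - v_inf) * math.exp(-g_sum * (t_i - t_prev) / c_mem)
        t_prev = t_i
        g_sum += g
        ge_sum += g * e
    return v


if __name__ == "__main__":
    # One excitatory input (E = 1.0) and one inhibitory input (E = -0.5).
    args = ([0.1, 0.3], [2.0, 1.0], [1.0, -0.5], 1.0)
    print("Euler     :", simulate_euler(*args))
    print("Piecewise :", simulate_piecewise(*args))
```

In this toy model the membrane potential always stays between the lowest and highest reversal potentials, so the synaptic drive saturates instead of summing linearly. That saturation is the kind of nonideality that an idealized weighted-sum model of IMC ignores, and that a physics-aware model can include in training.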
