Artificial Synapse based on ULTRARAM Memory Device for Neuromorphic Applications
Abhishek Kumar, Peter D. Hodgson, Manus Hayne, Avirup Dasgupta
TL;DR
This work demonstrates on-chip training and inference for DNNs using ULTRARAM-based synaptic arrays integrated with CMOS peripherals. A physics-based floating-gate model and a circuit macro-model enable realistic evaluation of area, latency, energy, and accuracy. With 2-bit devices, the design achieves 1.8× area and 1.52× energy improvements over SRAM at 91% training accuracy. The architecture employs multi-state ULTRARAM cells within crossbar arrays, a transposable synaptic array with shared ADCs, and gradient-driven weight updates, enabling efficient CNN training on CIFAR-10 with VGG-8. Projections at the 32 nm node suggest that ULTRARAM-based CIM can outperform SRAM and remain competitive with FeFET-based and other analog synapses, underscoring ULTRARAM's potential as a practical artificial synapse for neuromorphic accelerators.
Abstract
The memory demands of large-scale deep neural networks (DNNs) require synaptic weight values to be stored and updated in off-chip memory such as dynamic random-access memory (DRAM), which reduces energy efficiency and increases training time. Monolithic crossbar or pseudo-crossbar arrays built from analog non-volatile memories can store and update weights on-chip, and so present an opportunity to accelerate DNN training efficiently. In this article, we present on-chip training and inference of a neural network using a synaptic array based on ULTRARAM memory devices together with complementary metal-oxide-semiconductor (CMOS) peripheral circuits. ULTRARAM is a promising emerging memory exhibiting high endurance (>10^7 P/E cycles), ultra-high retention (>1000 years), and ultra-low switching energy per unit area. We propose a physics-based compact model of the ULTRARAM memory device that captures the real-time trapping and de-trapping of charges in the floating gate (FG), and we use it for the synapse simulations. A circuit-level macro-model is employed to evaluate and benchmark the on-chip learning performance of an ULTRARAM synaptic core in terms of area, latency, energy, and accuracy. Compared with a CMOS-based design, the ULTRARAM synaptic core improves area and energy by 1.8× and 1.52×, respectively, while reaching a training accuracy of 91%.
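To make the idea of on-chip weight storage and update concrete, the following is a minimal Python sketch of an idealized crossbar with 2-bit (four-state) cells. It is an illustration under stated assumptions, not the compact model or macro-model used in this work: the function names, the weight range of [-1, 1], the evenly spaced states, and the learning rate are all hypothetical, and device physics, ADC quantization, and peripheral circuits are not modeled.

```python
# Illustrative sketch only: an idealized crossbar with 2-bit (4-level) cells.
# All names and numeric values are hypothetical, not taken from the paper.

LEVELS = 4  # a 2-bit device offers 4 programmable states


def quantize(w, w_min=-1.0, w_max=1.0, levels=LEVELS):
    """Map a real-valued weight to the nearest of `levels` evenly spaced states."""
    w = min(max(w, w_min), w_max)          # clip to the representable range
    step = (w_max - w_min) / (levels - 1)  # spacing between adjacent states
    index = round((w - w_min) / step)      # nearest state index, 0..levels-1
    return w_min + index * step


def forward(weights, x):
    """Forward read: inputs drive the rows, each column sums weight * input."""
    rows, cols = len(weights), len(weights[0])
    return [sum(weights[i][j] * x[i] for i in range(rows)) for j in range(cols)]


def backward(weights, err):
    """Transposed read: errors drive the columns, each row sums weight * error."""
    rows, cols = len(weights), len(weights[0])
    return [sum(weights[i][j] * err[j] for j in range(cols)) for i in range(rows)]


def update(weights, x, err, lr=0.5):
    """Gradient-driven update: subtract lr * x[i] * err[j], then requantize."""
    return [[quantize(weights[i][j] - lr * x[i] * err[j])
             for j in range(len(err))] for i in range(len(x))]
```

In this sketch the same stored array serves both `forward` and `backward`, which is the role a transposable synaptic array plays during training; `update` writes the gradient step back into the array and snaps each weight to one of the four allowed states.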
