Artificial Synapse based on ULTRARAM Memory Device for Neuromorphic Applications

Abhishek Kumar, Peter D. Hodgson, Manus Hayne, Avirup Dasgupta

TL;DR

This work demonstrates on-chip training and inference for DNNs using ULTRARAM-based synaptic arrays integrated with CMOS peripherals. A physics-based floating-gate model and a circuit macro-model enable realistic evaluation of area, latency, energy, and accuracy, showing 1.8× area and 1.52× energy improvements over SRAM with 91% training accuracy for 2-bit devices, and strong projections at advanced technology nodes. The architecture employs multi-state ULTRARAM cells within crossbar arrays, a transposable synaptic array with shared ADCs, and gradient-driven weight updates, enabling efficient CNN training on CIFAR-10 with VGG-8. Scaling to 32 nm suggests ULTRARAM-based CIM can outperform SRAM and remain competitive with FeFET- and other analog synapses, underscoring ULTRARAM’s potential as a practical artificial synapse for neuromorphic accelerators.

Abstract

The memory demands of large-scale deep neural networks (DNNs) require synaptic weight values to be stored and updated in off-chip memory such as dynamic random-access memory (DRAM), which reduces energy efficiency and increases training time. Monolithic crossbar or pseudo-crossbar arrays built from analog non-volatile memories can store and update weights on-chip, and therefore offer a way to accelerate DNN training efficiently. In this article, we present on-chip training and inference of a neural network using an ULTRARAM memory device-based synaptic array and complementary metal-oxide-semiconductor (CMOS) peripheral circuits. ULTRARAM is a promising emerging memory exhibiting high endurance (>10^7 P/E cycles), ultra-high retention (>1000 years), and ultra-low switching energy per unit area. A physics-based compact model of the ULTRARAM memory device is proposed to capture the real-time trapping/de-trapping of charges in the floating gate (FG), and it is used for the synapse simulations. A circuit-level macro-model is employed to evaluate and benchmark the on-chip learning performance of an ULTRARAM synaptic core in terms of area, latency, energy, and accuracy. Compared with a CMOS-based design, the ULTRARAM synaptic core improves area and energy by 1.8× and 1.52×, respectively, with a training accuracy of 91%.
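To illustrate how a crossbar of multi-state cells can both compute and learn on-chip, the following is a minimal sketch and not the authors' compact model or macro-model. It assumes a hypothetical 2-bit cell with four evenly spaced, normalized conductance levels, an ideal crossbar in which each column current is the sum of input voltage times cell conductance, and a simplified sign-based update that moves each cell by one state per step. The names `N_STATES`, `state_to_conductance`, `crossbar_mac`, and `update_states` are illustrative only.

```python
N_STATES = 4              # assumption: 2-bit device with four conductance levels
G_MIN, G_MAX = 0.0, 1.0   # assumption: normalized conductance range


def state_to_conductance(state):
    """Map a discrete cell state (0..N_STATES-1) to a normalized conductance."""
    return G_MIN + (G_MAX - G_MIN) * state / (N_STATES - 1)


def crossbar_mac(states, inputs):
    """Ideal crossbar read: column current I_j = sum_i V_i * G_ij."""
    n_rows, n_cols = len(states), len(states[0])
    return [
        sum(inputs[i] * state_to_conductance(states[i][j]) for i in range(n_rows))
        for j in range(n_cols)
    ]


def update_states(states, grads, threshold=0.0):
    """Sign-based, gradient-driven update with saturation at the end states.

    A negative gradient potentiates the cell by one state (LTP) and a
    positive gradient depresses it by one state (LTD).
    """
    updated = []
    for row_states, row_grads in zip(states, grads):
        new_row = []
        for s, g in zip(row_states, row_grads):
            if g < -threshold:
                s = min(s + 1, N_STATES - 1)
            elif g > threshold:
                s = max(s - 1, 0)
            new_row.append(s)
        updated.append(new_row)
    return updated


if __name__ == "__main__":
    states = [[0, 3], [1, 2]]          # 2x2 array of cell states
    currents = crossbar_mac(states, [1.0, 0.5])
    print(currents)
    print(update_states(states, [[-0.2, -0.1], [0.3, 0.0]]))
```

A real device would have non-uniform, nonlinear state spacing and cycle-to-cycle variation, which is what a physics-based compact model is meant to capture; the even spacing here is purely a simplification.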

Paper Structure

This paper contains 8 sections, 3 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Schematic of an ULTRARAM memory cell and the corresponding transmission electron microscope image of the device's epilayers [ch7ur1.ted].
  • Figure 2: (a) Validation of the model against experimental I-V characteristics [ch7ur2.aem]. (b) Variations in the memory window (MW) of the device with pulse width and rise/fall time.
  • Figure 3: Architecture-level representation of the on-chip learning hardware.
  • Figure 4: Schematic of the VGG-8 model [vgg8_ref] used for image classification on the CIFAR-10 dataset [vgg8_code].
  • Figure 5: Simulated response of an ULTRARAM cell to (a) identical pulses (same magnitude and pulse width), (b) variable pulse width for a fixed voltage magnitude, and (c) variable amplitude for a fixed pulse width. The number of accessible partial states is maximized when using a variable amplitude pulse scheme ($\sim 32$ states for LTP and LTD).
  • ...and 5 more figures