Table of Contents
Fetching ...

STRIDe: Cross-Coupled STT-MRAM Enabling Robust In-Memory-Computing for Deep Neural Network Accelerators

Imtiaz Ahmed, Sumeet Kumar Gupta

Abstract

As deep neural network (DNN) models are growing exponentially in size, their deployment on resource-constrained edge platforms is becoming increasingly challenging. In-memory-computing (IMC) with non-volatile memories (NVMs) has emerged as a potential solution by virtue of its higher energy efficiency compared to standard DNN hardware platforms. Amongst various NVMs, STT-MRAM is highly promising owing to its high endurance and other benefits. However, their IMC implementation is challenging because of their inherently low distinguishability. This issue is exacerbated due to array non-idealities and process-variations, leading to poor IMC robustness and severe inference accuracy degradation. To address this problem, we propose STRIDe - STT-MRAM-based IMC leveraging cross-coupling action to boost the bitcell-level high-to-low current ratio to up to 8000. We propose two flavors of STRIDe designs, both offering robust IMC for inputs and weights in {-1, 1}(XNOR-IMC) and {0, 1}(AND-IMC) regime. Our evaluations for STRIDe arrays show up to 3.86x and 1.77x sense margin (SM) improvement for XNOR-IMC and AND-IMC, respectively, and up to 27.6% read disturb margin (RDM) improvement over standard MRAM-IMC designs. The enhanced robustness of STRIDe translates to near-software inference accuracies (considering crossbar non-idealities and process variations) for ResNet18 BNN and 4-bit DNN trained on CIFAR10 dataset. We observe accuracy improvements of up to 70% (for BNN) and up to 35%(for 4-bit DNN) over standard MRAM designs, albeit with some energy-area-latency penalty.

STRIDe: Cross-Coupled STT-MRAM Enabling Robust In-Memory-Computing for Deep Neural Network Accelerators

Abstract

As deep neural network (DNN) models are growing exponentially in size, their deployment on resource-constrained edge platforms is becoming increasingly challenging. In-memory-computing (IMC) with non-volatile memories (NVMs) has emerged as a potential solution by virtue of its higher energy efficiency compared to standard DNN hardware platforms. Amongst various NVMs, STT-MRAM is highly promising owing to its high endurance and other benefits. However, their IMC implementation is challenging because of their inherently low distinguishability. This issue is exacerbated due to array non-idealities and process-variations, leading to poor IMC robustness and severe inference accuracy degradation. To address this problem, we propose STRIDe - STT-MRAM-based IMC leveraging cross-coupling action to boost the bitcell-level high-to-low current ratio to up to 8000. We propose two flavors of STRIDe designs, both offering robust IMC for inputs and weights in {-1, 1}(XNOR-IMC) and {0, 1}(AND-IMC) regime. Our evaluations for STRIDe arrays show up to 3.86x and 1.77x sense margin (SM) improvement for XNOR-IMC and AND-IMC, respectively, and up to 27.6% read disturb margin (RDM) improvement over standard MRAM-IMC designs. The enhanced robustness of STRIDe translates to near-software inference accuracies (considering crossbar non-idealities and process variations) for ResNet18 BNN and 4-bit DNN trained on CIFAR10 dataset. We observe accuracy improvements of up to 70% (for BNN) and up to 35%(for 4-bit DNN) over standard MRAM designs, albeit with some energy-area-latency penalty.

Paper Structure

This paper contains 24 sections, 8 equations, 14 figures, 1 table.

Figures (14)

  • Figure 1: Magnetic Tunnel Junction (MTJ) based STT-MRAM. (a)Layers of MTJ. PL and FL are insulated with oxide (MgO) layer. (b)1T-1MTJ and 2T-2MTJ based STT-MRAM bitcells.(c)STT-MRAM based IMC Crossbar arrays.
  • Figure 2: (a) Circuit diagram for STRIDe-I and II bitcells, (b) STRIDe-I bitcell layout (45nm node), modified from 11133247, (c) STRIDe-II bitcell layout.
  • Figure 3: Direction of switching current required to write into MTJ11133247.
  • Figure 4: Write Operation in STRIDe-I for (a) P-AP programming and (b) AP-P programming. WWL is asserted, turning M4 on and allowing controlled flow of switching current depending on BL/BLB voltages.
  • Figure 5: Two-cycle write of P-AP in STRIDe-II. MTJL and MTJR are written during first and second cycle, respectively. Direction of current flow is controlled with appropriate terminal bias voltages while keeping WL asserted.
  • ...and 9 more figures