On Error Correction for Nonvolatile Processing-In-Memory

Hüsrev Cılasun, Salonik Resch, Zamshed I. Chowdhury, Masoud Zabihi, Yang Lv, Brandon Zink, Jian-Ping Wang, Sachin S. Sapatnekar, Ulya R. Karpuzcu

TL;DR

This paper revisits the error correction design space for nonvolatile PiM, considering both storage/memory and computation-induced errors, surveying several self-checking and homomorphic approaches and proposing several solutions.

Abstract

Processing in memory (PiM) represents a promising computing paradigm to enhance the performance of numerous data-intensive applications. Variants that compute directly in emerging nonvolatile memories can deliver very high energy efficiency. PiM architectures directly inherit the vulnerabilities of the underlying memory substrates, but they are also subject to errors due to the computation in place. Numerous well-established error correcting codes (ECC) for memory exist and are also considered in the PiM context; however, they typically ignore errors that occur throughout computation. In this paper, we revisit the error correction design space for nonvolatile PiM, considering both storage/memory and computation-induced errors, surveying several self-checking and homomorphic approaches. We propose several solutions and analyze their complex performance-area-coverage trade-off, using three representative nonvolatile PiM technologies. All of these solutions guarantee single error correction for both bulk bitwise computations and ordinary memory/storage errors.

Paper Structure (20 sections, 7 equations, 9 figures, 5 tables)

Figures (9)

  • Figure 1: Logic gate construction in a row of (a) ReRAM [kvatinsky2014magic], (b) STT-MRAM [chowdhury2017efficient], (c) SOT/SHE-MRAM [zabihi2018memory] arrays; (d) Electrical equivalent, circuit symbol, and truth table in terms of input and output resistance levels (for STT/SHE-MRAM as a representative example) for 2-input NOR. $V_{bias}$ is a gate-specific voltage applied to Bit Select Lines (BSL). (b) and (c) make a distinction between even (EBSL) and odd (OBSL) BSLs. (c) also distinguishes between Word Lines (WL) for reads (WLR) and writes (WLW).
  • Figure 2: Check symbol layout.
  • Figure 3: (a) Overall system architecture. (b) Main computation in (one row of) memory with logic levels explicitly shown. (c) Main computation (in one row of memory) interleaved with error detection/correction as performed by Checker blocks. For error detection, ECiM Checker blocks are in charge of syndrome generation; TRiM Checker blocks, majority vote calculation. Metadata translates into parity bits for ECiM; two redundant computation outputs, for TRiM.
  • Figure 4: Operation layout (a) and timing (b). Each row independently computes the same gates, logic level by logic level, on different data. To overlap computations in one row with reads or writes in other rows, computations in each row start in a delayed fashion (b). More specifically, recall that we use universal NOR gates as core building blocks for computation; all computations are synthesized using NOR. With delayed start, when the $r^\text{th}$ row executes the $s^\text{th}$ NOR, the $(r+1)^\text{st}$ row would execute the $(s-1)^\text{st}$ NOR of the same level, on different data.
  • Figure 5: Timing diagram for parity updates vs. main computation. Each waveform captures the activity in a specific block over time. The hatch pattern depicts an idle block. The computation (NOR) that triggered the parity update (i.e., the two steps of XOR) and the corresponding parity update operations are labeled using the same color. NOR* and NOR in this diagram point to the very same NOR gate in actual computation, where NOR* indicates the calculation of the second output of $\texttt{NOR}^\texttt{n}_\texttt{22}$ in a different block. $\texttt{XOR}_1$ and $\texttt{XOR}_2$ denote the two steps of XOR: NOR$_\texttt{22}$ and THR, respectively. For each gate, the corresponding step in computation is indicated in parentheses: NOR(n+1) is the NOR gate initiated at Step n+1.
  • ...and 4 more figures
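
Two ideas from the captions above can be illustrated with a small software model: the delayed row start of Figure 4 and the majority vote that Figure 3 attributes to TRiM Checker blocks. The sketch below is only an illustration under stated assumptions, not the paper's implementation. The function names (`nor`, `majority`, `delayed_schedule`), the 4-bit word width in the usage lines, the one-step delay per row, and the flattening of logic levels into a single NOR sequence are all hypothetical choices made for the example.

```python
from typing import List, Optional


def nor(a: int, b: int, width: int = 8) -> int:
    """Bulk bitwise NOR of two `width`-bit words (software stand-in for the in-array gate)."""
    mask = (1 << width) - 1
    return ~(a | b) & mask


def majority(x: int, y: int, z: int) -> int:
    """Bitwise majority vote over three redundant results.

    Any single faulty copy is outvoted by the other two, bit by bit.
    """
    return (x & y) | (x & z) | (y & z)


def delayed_schedule(num_rows: int, num_nors: int) -> List[List[Optional[int]]]:
    """Build a timing table for the delayed start.

    table[t][r] is the 1-based index of the NOR that row r executes at time
    step t, or None when the row is idle. Assumption: row r starts exactly r
    steps after row 0, so when row r executes NOR s, row r+1 executes NOR s-1.
    """
    total_steps = num_rows + num_nors - 1
    table: List[List[Optional[int]]] = []
    for t in range(total_steps):
        ops: List[Optional[int]] = []
        for r in range(num_rows):
            s = t - r + 1
            ops.append(s if 1 <= s <= num_nors else None)
        table.append(ops)
    return table


# Usage: three rows, four NOR gates each.
schedule = delayed_schedule(num_rows=3, num_nors=4)
for t, ops in enumerate(schedule):
    print(t, ops)

# Usage: one of three redundant NOR outputs suffers a single bit flip.
correct = nor(0b1100, 0b1010, width=4)
faulty = correct ^ 0b0100  # hypothetical computation-induced error
voted = majority(faulty, correct, correct)
print(voted == correct)
```

In this model, no two rows execute the same NOR index at the same time step, which is what leaves room to overlap one row's computation with reads or writes in other rows. The vote recovers the correct word as long as at most one of the three copies is wrong in any given bit position.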