Cross-layer Modeling and Design of Content Addressable Memories in Advanced Technology Nodes for Similarity Search

Siri Narla, Piyush Kumar, Mohammad Adnaan, Azad Naeemi

TL;DR

This work examines how interconnect parasitics at the 7 nm node degrade CAM-based similarity search across SRAM-, SOT-MRAM-, and FeFET-based CAMs. It builds a cross-layer modeling flow that extracts device- and layout-level parasitics with ASAP7/nxtgrd, converts them into SPICE models, and maps Hamming distance ($HDist$) to matchline discharge delays. The flow shows that the delay difference between consecutive $HDist$ values scales as $t_n - t_{n-1} \propto 1/(n(n-1))$ and is sensitive to IR/RC effects. The work proposes two mitigations: relocating the search lines to an upper metal level and widening them (S2x), and matching the RC delay of the clock and search paths. It demonstrates application-level gains on dataset search and a sequential recommender, and highlights the density and energy trade-offs of SOT- and FeFET-CAMs versus SRAM-CAM. The study provides a practical cross-layer framework and actionable guidance for deploying CAM-based similarity search in advanced nodes, illustrating when NVM-CAMs offer density benefits with favorable energy profiles.
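To make the scaling concrete, here is a minimal Python sketch of an idealized matchline model; it is not the paper's SPICE flow. It assumes each mismatching cell sinks a constant current $I_{sat}$ from a matchline of capacitance $C$ precharged to $V_{ML}$, so a row with $HDist = n$ reaches $V_{threshold}$ after $t_n = C(V_{ML} - V_{threshold})/(n I_{sat})$, and the gap between consecutive distances has magnitude $C(V_{ML} - V_{threshold})/(I_{sat}\,n(n-1))$. The function names and all numeric values are hypothetical placeholders, and interconnect IR/RC effects are deliberately ignored.

```python
def discharge_delay(n_mismatch, c_ml, v_ml, v_threshold, i_sat):
    """Ideal matchline discharge delay (seconds) for HDist = n_mismatch.

    Assumes n_mismatch cells each pull a constant current i_sat from a
    lumped matchline capacitance c_ml; no wire resistance is modeled.
    """
    if n_mismatch < 1:
        raise ValueError("a fully matching row never discharges in this model")
    return c_ml * (v_ml - v_threshold) / (n_mismatch * i_sat)


def delay_gap(n, **params):
    """Sensing window t_{n-1} - t_n separating HDist = n-1 from HDist = n."""
    if n < 2:
        raise ValueError("the gap is only defined for n >= 2")
    return discharge_delay(n - 1, **params) - discharge_delay(n, **params)


# Hypothetical placeholder values, chosen only for illustration.
PARAMS = dict(c_ml=10e-15, v_ml=0.5, v_threshold=0.25, i_sat=5e-6)

if __name__ == "__main__":
    # Prefactor C * (V_ML - V_threshold) / I_sat of the 1/(n(n-1)) law.
    k = PARAMS["c_ml"] * (PARAMS["v_ml"] - PARAMS["v_threshold"]) / PARAMS["i_sat"]
    for n in (2, 5, 10, 40):
        gap = delay_gap(n, **PARAMS)
        # Both columns should agree, confirming the 1/(n(n-1)) scaling.
        print(f"n={n:2d}  gap={gap:.3e} s  k/(n(n-1))={k / (n * (n - 1)):.3e} s")
```

In this idealized model the window between $HDist = 39$ and $HDist = 40$ is several hundred times narrower than the one between 1 and 2, which suggests why parasitic-induced delay shifts matter most when distinguishing large distances.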

Abstract

In this paper, we present a comprehensive design and benchmarking study of Content Addressable Memory (CAM) at the 7 nm technology node in the context of similarity search applications. We design CAM cells based on SRAM, spin-orbit torque (SOT), and ferroelectric field-effect transistor (FeFET) devices and extract cell parasitics from their layouts using state-of-the-art EDA tools. These parasitics are used to develop SPICE netlists that model search operations. We use a CAM-based dataset search and a sequential recommendation system to highlight the application-level performance degradation due to interconnect parasitics. We propose and evaluate two solutions to mitigate interconnect effects.

Paper Structure

This paper contains 8 sections, 5 figures, and 11 tables.

Figures (5)

  • Figure 1: Cell schematics of (a) SOT-based CAM, (b) SRAM-based CAM and (c) FeFET-based CAM
  • Figure 2: (a) SOT-CAM 2x1 cell layout. This layout shows 2 aligned cells that share WR, WRB and VSS contacts to reduce overall area. Cell size is 6F x 30F. (b) SRAM-based CAM cell layout. Cell size is 9F x 37.5F. (c) FeFET-based CAM cell layout. Cell size is 9F x 15F. F is half the M1 metal pitch and equals 18 nm.
  • Figure 3: The difference in ML discharge delay between HDist=n ( $\mathrm{t_{n}}$) and HDist=n-1 ( $\mathrm{t_{n-1}}$) vs $1/(n(n-1))$. $\mathrm{V_{ML}}$ is the matchline precharge voltage and $\mathrm{V_{threshold}}$ is the voltage at which the sense amplifier detects discharge. C is the ML capacitance and $\mathrm{I_{sat}}$ is the discharge current from a mismatching cell.
  • Figure 4: Percentage increase in the delay to discharge MLs with HDist=40, from the row closest to the driver to the row farthest from the driver, for (a) SOT-CAM arrays, (b) SRAM-CAM arrays for Vdd=0.5 V, and (c) FeFET-CAM arrays, each with 128 columns and various numbers of rows. S2x is the case where the searchlines are shifted to M4 and made 2x wider.
  • Figure 5: Minimum detectable distance for (a) SOT-CAM, (b) SRAM-CAM, and (c) FeFET-CAM arrays.
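
The cell dimensions quoted in the Figure 2 caption can be turned into a rough area comparison. The Python sketch below only multiplies out those dimensions with F = 18 nm. It takes each quoted size at face value as a single footprint; since the SOT-CAM figure shows a 2x1 layout, whether 6F x 30F covers one cell or the shared pair is an assumption here. The function and dictionary names are hypothetical.

```python
F_NM = 18.0  # feature size F in nm: half the M1 metal pitch (Figure 2 caption)

# Footprints in units of F, copied from the Figure 2 caption.
# Assumption: each entry is treated as the footprint of one cell.
CELL_SIZES_F = {
    "SOT-CAM": (6.0, 30.0),
    "SRAM-CAM": (9.0, 37.5),
    "FeFET-CAM": (9.0, 15.0),
}


def cell_area_um2(side_a_f, side_b_f, f_nm=F_NM):
    """Footprint area in square micrometers for sides given in units of F."""
    area_nm2 = (side_a_f * f_nm) * (side_b_f * f_nm)
    return area_nm2 * 1e-6  # 1 um^2 = 1e6 nm^2


def areas_relative_to(reference="SRAM-CAM"):
    """Each footprint area divided by the reference cell's area."""
    ref_area = cell_area_um2(*CELL_SIZES_F[reference])
    return {
        name: cell_area_um2(*sides) / ref_area
        for name, sides in CELL_SIZES_F.items()
    }


if __name__ == "__main__":
    ratios = areas_relative_to()
    for name, sides in CELL_SIZES_F.items():
        print(f"{name:10s} {cell_area_um2(*sides):.5f} um^2  "
              f"({ratios[name]:.2f}x SRAM-CAM)")
```

Under these assumptions the FeFET-CAM footprint comes out at 0.4 times the SRAM-CAM footprint, which is consistent with the density advantage the TL;DR attributes to the NVM-based cells; array-level density also depends on peripheral circuitry that this arithmetic does not capture.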