A 9T4R RRAM-Based ACAM for Analogue Template Matching at the Edge
Georgios Papandroulidakis, Shady Agwa, Ahmet Cirakoglu, Themis Prodromakis
TL;DR
The paper tackles the energy bottleneck of data movement in edge AI by proposing an analogue template-matching accelerator built from a 9T4R RRAM-CMOS pixel (TXL) for Analogue Content Addressable Memory (ACAM). Each cell uses hybrid RRAM-CMOS inverters to set two thresholds, which together define a configurable matching window and enable near-sensor analogue template matching with reduced data conversion. A 32×48 TXL-ACAM prototype in 180 nm CMOS with back-end-of-line RRAM dissipates approximately 0.16 pJ per match and 0.036 pJ per mismatch per cell at 66 MHz and 3 V, and includes programmable read/write functionality and system-level peripherals. The work also analyses process variability and presents a complete IC design with front-end sampling, analogue drivers, accumulators, sense amplifiers, and serial readout, underscoring the potential of memory-centric accelerators for energy-efficient edge classification, with scaling and benchmarking left as opportunities for future work.
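The dual-threshold matching window described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the circuit or the authors' method: the names `Cell`, `row_match_count`, and `best_template` are hypothetical, the window bounds are idealised numbers rather than RRAM-programmed inverter thresholds, and picking the row with the most matching cells is an assumed way of turning per-cell matches into a classification.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Cell:
    """Idealised ACAM cell: matches when the input lies inside [lower, upper]."""
    lower: float  # lower threshold of the matching window (illustrative volts)
    upper: float  # upper threshold of the matching window (illustrative volts)

    def matches(self, v: float) -> bool:
        return self.lower <= v <= self.upper


def row_match_count(row: Sequence[Cell], inputs: Sequence[float]) -> int:
    """Count how many cells in one stored template match the analogue inputs."""
    if len(row) != len(inputs):
        raise ValueError("template row and input vector must have equal length")
    return sum(cell.matches(v) for cell, v in zip(row, inputs))


def best_template(templates: Sequence[Sequence[Cell]], inputs: Sequence[float]) -> int:
    """Return the index of the template with the most matching cells (assumed rule)."""
    scores = [row_match_count(row, inputs) for row in templates]
    return max(range(len(scores)), key=scores.__getitem__)


# Two hypothetical 3-cell templates and one analogue input vector.
templates = [
    [Cell(0.0, 1.0), Cell(1.0, 2.0), Cell(2.0, 3.0)],
    [Cell(2.0, 3.0), Cell(0.0, 1.0), Cell(1.0, 2.0)],
]
inputs = [0.5, 1.5, 2.9]
print(best_template(templates, inputs))
```

In this toy case every cell of the first template matches and none of the second does, so the first template is selected. Widening or narrowing a cell's window corresponds to the configurability the design attributes to each pixel.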
Abstract
The continuous shift of computational bottlenecks towards memory access and data transfer, especially in AI applications, creates an urgent need to re-engineer the fundamentals of computer architecture. Many edge computing applications, such as wearable and implantable medical devices, pose growing challenges for conventional computing systems because of the strict area and power constraints at the edge. Emerging technologies such as Resistive RAM (RRAM) have shown promise for developing neuro-inspired analogue computing paradigms that combine high classification capability with high energy efficiency. In this work, we present a novel RRAM-based Analogue Content Addressable Memory (ACAM) for on-line analogue template matching applications. This ACAM-based template matching architecture targets energy-efficient classification in settings where low energy is of utmost importance. We showcase a novel, highly tuneable RRAM-based ACAM pixel, implemented using a commercial 180 nm CMOS technology and in-house RRAM technology, that exhibits low energy dissipation of approximately 0.036 pJ and 0.16 pJ for mismatch and match, respectively, at 66 MHz with a 3 V supply. A proof-of-concept system-level implementation based on this pixel design is also implemented in 180 nm.
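To give a sense of scale for the per-cell figures, the following back-of-envelope sketch sums cell energies for one search over a 32×48 array. It rests on loud assumptions that the source does not confirm: that per-cell energies add linearly, that every cell is evaluated once per search, and that peripheral circuits (drivers, accumulators, sense amplifiers, readout) are ignored. The function name `array_cell_energy_pj` is hypothetical.

```python
# Per-cell energies reported at 66 MHz and 3 V (approximate values).
E_MATCH_PJ = 0.16
E_MISMATCH_PJ = 0.036

# Array size of the prototype: 32 x 48 cells.
TOTAL_CELLS = 32 * 48


def array_cell_energy_pj(n_matching: int, total_cells: int = TOTAL_CELLS) -> float:
    """Cell-only energy for one search, assuming linear summation (an assumption)."""
    if not 0 <= n_matching <= total_cells:
        raise ValueError("n_matching must lie between 0 and total_cells")
    n_mismatching = total_cells - n_matching
    return n_matching * E_MATCH_PJ + n_mismatching * E_MISMATCH_PJ


# Bounds under these assumptions: all cells mismatch versus all cells match.
print(f"all mismatch: {array_cell_energy_pj(0):.3f} pJ")
print(f"all match:    {array_cell_energy_pj(TOTAL_CELLS):.3f} pJ")
```

Under these assumptions the cell-only energy per search lies between roughly 55 pJ and 246 pJ. These are illustrative bounds, not measured system-level results.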
