IMPACT: InMemory ComPuting Architecture Based on Y-FlAsh Technology for Coalesced Tsetlin Machine Inference
Omar Ghazal, Wei Wang, Shahar Kvatinsky, Farhad Merchant, Alex Yakovlev, Rishad Shafik
TL;DR
The paper targets the data-movement and energy bottlenecks of ML workloads by proposing IMPACT, an in-memory computing architecture built on Y-Flash memristors that executes coalesced Tsetlin machine (CoTM) inference. It couples a clause crossbar (Boolean, TA-driven clause computation) with a class crossbar (analog, weight-based classification), mapping TA actions to high- and low-conductance states (HCS/LCS) and weights to conductances to realize $V = W \cdot C$, with $Y_i = 1$ if $V_i > 0$. On MNIST, IMPACT achieves 96.3% accuracy with competitive energy efficiency (≈ 24.56 TOPS/W, ≈ 0.17 TOPS/mm$^2$) and low energy per datapoint (~67.99 pJ for the clause tile and ~16.22 pJ for the class tile), and it remains robust under device-to-device (D2D) and cycle-to-cycle (C2C) variability. The architecture highlights scalable, energy-aware in-memory inference using two-terminal Y-Flash devices that self-select to suppress sneak paths, supporting future expansion to larger datasets such as CIFAR-10 and ImageNet.
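The two-stage computation summarized above (Boolean clauses, then $V = W \cdot C$ with a threshold on each $V_i$) can be illustrated with a minimal software sketch. This is only a functional model under stated assumptions, not the authors' implementation: the function names are hypothetical, signed integers stand in for the analog conductance-encoded weights, no Y-Flash device behavior is modeled, and a clause with no included literal is assumed to output 0.

```python
def literals(features):
    """Return the literal vector [x1..xn, NOT x1..NOT xn] for Boolean features."""
    return list(features) + [1 - x for x in features]


def clause_outputs(include, lits):
    """Evaluate each clause as the AND of its included literals.

    include[j][k] == 1 means the TA for clause j, literal k chose "include".
    Assumption: a clause that includes no literal outputs 0 at inference.
    """
    outputs = []
    for row in include:
        selected = [lit for inc, lit in zip(row, lits) if inc]
        outputs.append(1 if selected and all(selected) else 0)
    return outputs


def class_sums(weights, clauses):
    """Compute V = W . C, one weighted clause sum per class."""
    return [sum(w * c for w, c in zip(row, clauses)) for row in weights]


def classify(weights, clauses):
    """Apply the rule Y_i = 1 if V_i > 0, else 0."""
    return [1 if v > 0 else 0 for v in class_sums(weights, clauses)]


# Toy example (illustrative values only): 2 features, 3 clauses, 2 classes.
x = [1, 0]
lits = literals(x)                 # [x1, x2, NOT x1, NOT x2]
include = [
    [1, 0, 0, 1],                  # clause 0: x1 AND NOT x2
    [0, 1, 0, 0],                  # clause 1: x2
    [0, 0, 0, 0],                  # clause 2: empty clause
]
weights = [
    [2, -1, 1],                    # class 0 weights over the 3 clauses
    [-3, 4, 0],                    # class 1 weights over the 3 clauses
]
C = clause_outputs(include, lits)
V = class_sums(weights, C)
Y = classify(weights, C)
```

In this toy case only clause 0 fires, so class 0 receives a positive sum and class 1 a negative one. In the hardware described, the include/exclude matrix would correspond to the clause crossbar and the weight matrix to the class crossbar.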
Abstract
The increasing demand for processing large volumes of data for machine learning models has pushed data bandwidth requirements beyond the capability of traditional von Neumann architectures. In-memory computing (IMC) has recently emerged as a promising solution to address this gap by enabling distributed data storage and processing at the micro-architectural level, significantly reducing both latency and energy. In this paper, we present IMPACT (InMemory ComPuting Architecture Based on Y-FlAsh Technology for Coalesced Tsetlin Machine Inference), underpinned by a cutting-edge memory device, Y-Flash, fabricated in a 180 nm CMOS process. Y-Flash devices have recently been demonstrated for digital and analog memory applications, offering high yield, non-volatility, and low power consumption. IMPACT leverages the Y-Flash array to implement inference for a novel machine learning algorithm based on propositional logic, the coalesced Tsetlin machine (CoTM). CoTM utilizes Tsetlin automata (TA) to create Boolean feature selections stochastically across parallel clauses. IMPACT is organized into two computational crossbars, one storing the TA and one storing the weights. In validation on the MNIST dataset, IMPACT achieved 96.3% accuracy. It also demonstrated improvements in energy efficiency, e.g., 2.23X over CNN-based ReRAM, 2.46X over a neuromorphic design using NOR-Flash, and 2.06X over DNN-based PCM, making it well suited to modern ML inference applications.
