Investigating Energy Bounds of Analog Compute-in-Memory with Local Normalization
Brian Rojkov, Shubham Ranjan, Derek Wright, Manoj Sachdev
TL;DR
This work tackles the energy efficiency challenge of analog Compute-in-Memory (CIM) for edge AI, focusing on floating-point workloads where wide dynamic range is necessary. It introduces the Gain-Ranging MAC (GR-MAC), a mixed-signal architecture that locally normalizes mantissas and applies exponent-weighted gain ranging to decouple input dynamic range from precision, keeping the MAC in a low-precision analog regime. The authors present architectural designs (unit/row/INT normalization variants) and an energy-modeling analysis showing that ADC energy can be reduced and that the input dynamic range can be widened by 4 bits without increasing energy at a 35 dB SQNR standard. Because the ADC resolution requirement becomes invariant to the input distribution, they also construct an upper bound on that requirement that sits about 1.5 bits below the conventional lower bound. Collectively, GR-MAC offers a pathway to better energy scaling in floating-point CIM for modern AI workloads such as Large Language Models.
Abstract
Modern edge AI workloads demand maximum energy efficiency, motivating the pursuit of analog Compute-in-Memory (CIM) architectures. Simultaneously, the popularity of Large Language Models (LLMs) drives the adoption of low-bit floating-point formats, which prioritize dynamic range. However, conventional direct-accumulation CIM accommodates floating-point values by normalizing them to a shared, widened fixed-point scale. Consequently, hardware resolution is dictated by the input's dynamic range rather than its precision, and energy consumption is dominated by the ADC. We address this limitation by introducing local normalization for each input, weight, and multiply-accumulate (MAC) output via a Gain-Ranging MAC (GR-MAC). Normalization overhead is handled by low-power digital logic, enabling the computationally expensive MAC operation to remain in the energy-efficient low-precision analog regime. Energy modelling shows that adding a gain-ranging stage to the MAC enables a 4-bit increase in input dynamic range without increased energy consumption at a 35 dB SQNR standard. Additionally, the ADC resolution requirement becomes invariant to input distribution assumptions, allowing construction of an upper bound that is 1.5 bits lower than the conventional lower bound. These results establish a pathway towards unlocking favourable energy scaling trends of analog CIM for modern AI workloads.
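To make the contrast between shared-scale and local normalization concrete, the following is a minimal sketch and not the authors' model. It assumes an idealized floating-point format with a hidden leading one and no subnormals or special values; the function names (`shared_scale_bits`, `local_bits`, `normalize`) and the example format parameters are hypothetical. It counts the fixed-point bits needed when every value is aligned to one shared scale, compares that with the width of a locally normalized mantissa, and shows a mantissa/exponent split using Python's `math.frexp`.

```python
import math
from typing import Tuple


def shared_scale_bits(mantissa_bits: int, exponent_levels: int) -> int:
    """Bits needed to align every value onto one shared fixed-point grid.

    Assumption: idealized format with a hidden leading one and no subnormals.
    Each extra exponent level shifts the significand by one more position,
    so the aligned word grows by one bit per level.
    """
    significand_bits = mantissa_bits + 1  # stored mantissa plus hidden one
    return significand_bits + (exponent_levels - 1)


def local_bits(mantissa_bits: int) -> int:
    """Bits needed when each value is normalized on its own.

    The exponent is tracked separately, so only the significand remains.
    """
    return mantissa_bits + 1


def normalize(x: float) -> Tuple[float, int]:
    """Split x into (m, e) with x == m * 2**e and 0.5 <= |m| < 1."""
    return math.frexp(x)


if __name__ == "__main__":
    # Hypothetical format: 3 stored mantissa bits, 16 usable exponent levels.
    print(shared_scale_bits(3, 16))  # width grows with dynamic range
    print(local_bits(3))             # width depends on precision only
    print(normalize(6.0))            # mantissa and exponent of 6.0
```

Under these assumptions the shared-scale width grows by one bit for every added exponent level, while the locally normalized width depends only on the mantissa. This is the sense in which resolution follows dynamic range rather than precision in the conventional scheme.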
