Predictive first-principles simulations for co-designing next-generation energy-efficient AI systems
Denis Mamaluy, Md Rahatul Islam Udoy, Juan P. Mendez, Ben Feinberg, Wei Pan, Ahmedullah Aziz
TL;DR
The article argues that predictive (first-principles, fitting-parameter-free) device and interconnect simulations can close the loop between nanoscale physics and workload-level metrics, enabling the identification of device/interconnect operating regimes that plausibly support improvements in the energy efficiency of AI accelerators.
Abstract
In modern generative-AI workloads, matrix-vector and matrix-matrix multiplications (\emph{MatMul}) dominate the compute and energy cost. Achieving dramatic reductions in energy per token therefore requires novel, specialized hardware that is co-designed across materials, devices, interconnects, circuits, and architectures rather than optimized at any single layer in isolation. In this \emph{Perspectives} article, we argue that \emph{predictive} (first-principles, fitting-parameter-free) device and interconnect simulations can close the loop between nanoscale physics and workload-level metrics, enabling the identification of device/interconnect operating regimes that plausibly support \emph{orders-of-magnitude} improvements in the energy efficiency of AI accelerators.
