On the Viability of using LLMs for SW/HW Co-Design: An Example in Designing CiM DNN Accelerators
Zheyu Yan, Yifan Qin, Xiaobo Sharon Hu, Yiyu Shi
TL;DR
This paper investigates using Large Language Models (LLMs) as optimizers for software-hardware co-design of Compute-in-Memory (CiM) DNN accelerators. It introduces LCDA, a framework that couples a GPT-4-based design optimizer with a design generator, performance evaluator, and cost evaluator to accelerate co-design over the NACIM baseline. Experiments on CIFAR-10 demonstrate up to $25\times$ faster design discovery with comparable accuracy and energy-latency trade-offs, though some CiM-specific behaviors require task-specific tuning. The work highlights the potential of LLM-informed co-design and outlines directions for explainability, open-source LLMs, and tooling to further advance this approach.
Abstract
Deep Neural Networks (DNNs) have demonstrated impressive performance across a wide range of tasks. However, deploying DNNs on edge devices poses significant challenges due to stringent power and computational budgets. An effective solution to this issue is software-hardware (SW-HW) co-design, which tailors DNN models and hardware architectures jointly so that they make the best use of available resources. However, SW-HW co-design traditionally suffers from slow optimization because its optimizers do not make use of heuristic knowledge, a limitation known as the "cold start" problem. In this study, we present a novel approach that leverages Large Language Models (LLMs) to address this issue. By utilizing the abundant knowledge of pre-trained LLMs in the co-design optimization process, we effectively bypass the cold start problem, substantially accelerating the design process. The proposed method achieves a speedup of $25\times$. This advancement paves the way for the rapid and efficient deployment of DNNs on edge devices.
