Rome was Not Built in a Single Step: Hierarchical Prompting for LLM-based Chip Design

Andre Nakkab, Sai Qian Zhang, Ramesh Karri, Siddharth Garg

TL;DR

The paper tackles the challenge of generating complex HDL designs with LLMs by introducing hierarchical prompting and the Recurrent Optimization via Machine Editing (ROME) pipeline. It presents two prompting modes, HDHP (human-defined hierarchy) and PGHP (purely generative hierarchy), and a three-phase workflow: hierarchy extraction, submodule implementation with automated testing, and top-level integration, all aided by automated feedback. Using a benchmark suite of hierarchical designs and evaluations across eight LLMs, the work reports higher design success rates and shorter generation times than flat prompting, along with case studies of fully automated processor designs, including the first LLM-designed processor produced with no human feedback. These findings suggest that hierarchical prompting can let smaller open-source models compete with large proprietary models on complex modules, and that it offers a path toward scalable, automated hardware design with LLMs.

Abstract

Large Language Models (LLMs) are effective in computer hardware synthesis via hardware description language (HDL) generation. However, LLM-assisted approaches for HDL generation struggle when handling complex tasks. We introduce a suite of hierarchical prompting techniques which facilitate efficient stepwise design methods, and develop a generalizable automation pipeline for the process. To evaluate these techniques, we present a benchmark set of hardware designs which have solutions with or without architectural hierarchy. Using these benchmarks, we compare various open-source and proprietary LLMs, including our own fine-tuned Code Llama-Verilog model. Our hierarchical methods automatically produce successful designs for complex hardware modules that standard flat prompting methods cannot achieve, allowing smaller open-source LLMs to compete with large proprietary models. Hierarchical prompting reduces HDL generation time and yields savings on LLM costs. Our experiments detail which LLMs are capable of which applications, and how to apply hierarchical methods in various modes. We explore case studies of generating complex cores using automatic scripted hierarchical prompts, including the first-ever LLM-designed processor with no human feedback. Tools for the Recurrent Optimization via Machine Editing (ROME) method can be found at https://github.com/ajn313/ROME-LLM
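
The three-phase workflow summarized in the TL;DR (hierarchy extraction, submodule implementation with automated testing, top-level integration) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the ROME tooling itself: the function `hierarchical_generate`, its parameters, the stub callables, and the decoder module names are all hypothetical, and a real flow would call an LLM and an HDL simulator where the stubs are.

```python
from typing import Callable, Dict, List


def hierarchical_generate(
    top_name: str,
    top_spec: str,
    extract_hierarchy: Callable[[str], List[str]],
    generate_module: Callable[[str, Dict[str, str], str], str],
    passes_tests: Callable[[str, str], bool],
    max_attempts: int = 3,
) -> Dict[str, str]:
    """Hypothetical sketch: build submodules first, then the top module.

    Returns a dict mapping each module name to its accepted HDL text.
    """
    finished: Dict[str, str] = {}
    # Phase 1: ask for the list of submodules (human- or model-provided).
    # Phase 3 is handled by appending the top module as the final item.
    order = extract_hierarchy(top_spec) + [top_name]
    # Phase 2: generate each module, retrying with feedback on test failure.
    for name in order:
        feedback = ""
        for attempt in range(1, max_attempts + 1):
            # Already-finished modules are passed in as context.
            code = generate_module(name, finished, feedback)
            if passes_tests(name, code):
                finished[name] = code
                break
            feedback = f"attempt {attempt} at '{name}' failed its testbench"
        else:
            raise RuntimeError(
                f"could not generate '{name}' in {max_attempts} attempts"
            )
    return finished


# --- Stubs standing in for an LLM and a simulator (assumptions) ---
calls = []


def fake_extract(spec: str) -> List[str]:
    return ["decoder_2to4", "decoder_3to8"]


def fake_generate(name: str, finished: Dict[str, str], feedback: str) -> str:
    calls.append((name, feedback))
    # Simulate one bad first attempt to exercise the feedback loop.
    if name == "decoder_3to8" and not feedback:
        return "// broken"
    uses = ", ".join(sorted(finished)) or "nothing"
    return f"module {name}; // may instantiate: {uses}\nendmodule"


def fake_tests(name: str, code: str) -> bool:
    return code.startswith("module")


design = hierarchical_generate(
    "decoder_4to16", "a 4-to-16 decoder",
    fake_extract, fake_generate, fake_tests,
)
```

Treating the top-level module as the last item in the same loop keeps the sketch short. A real pipeline might use a different prompt or test strategy for the integration step.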

Paper Structure (19 sections, 2 equations, 7 figures, 5 tables)

Figures (7)

  • Figure 1: Automatic Hierarchical Prompting Pipeline.
  • Figure 2: Structure of a hierarchical step, which uses automated prompting and existing hierarchy to generate a new module.
  • Figure 3: Example of Hierarchical Verilog Decoder Implementation using text-completion LLM. Model outputs are in blue.
  • Figure 4: Hierarchical prompting yields consistent time savings vs. flat prompting, as seen by average % latency reduction. More time savings are seen on modules which are difficult or impossible to generate non-hierarchically, or on those for which flat outputs are longer than hierarchical alternatives.
  • Figure 5: Perseveration-like behavior in GPT-3.5 when asked for rare syntax.
  • ...and 2 more figures