VeriGen: A Large Language Model for Verilog Code Generation

Shailja Thakur; Baleegh Ahmad; Hammond Pearce; Benjamin Tan; Brendan Dolan-Gavitt; Ramesh Karri; Siddharth Garg

TL;DR

VeriGen demonstrates that fine-tuning open-source LLMs on a large Verilog corpus enables competitive, hardware-focused code generation. By combining GitHub Verilog code and Verilog textbooks, the authors build a robust training and evaluation pipeline, using hand-crafted and HDLBits-based problem sets with comprehensive test benches. The results show that CodeGen-16B-FT delivers strong performance across problem difficulties, often surpassing pre-trained baselines and approaching or matching larger commercial models in several scenarios, while offering faster inference and open-access checkpoints. The work highlights the practical potential of smaller, in-house LLMs for HDL design automation, while acknowledging remaining challenges in achieving full functional correctness without human refinement and stressing the value of richer training data and prompt engineering.

Abstract

In this study, we explore the capability of Large Language Models (LLMs) to automate hardware design by generating high-quality Verilog code, a common language for designing and modeling digital systems. We fine-tune pre-existing LLMs on Verilog datasets compiled from GitHub and Verilog textbooks. We evaluate the functional correctness of the generated Verilog code using a specially designed test suite, featuring a custom problem set and test benches. Here, our fine-tuned open-source CodeGen-16B model outperforms the commercial state-of-the-art GPT-3.5-turbo model with a 1.1% overall increase. Upon testing with a more diverse and complex problem set, we find that the fine-tuned model shows competitive performance against the state-of-the-art GPT-3.5-turbo, excelling in certain scenarios. Notably, it demonstrates a 41% improvement in generating syntactically correct Verilog code across various problem categories compared to its pre-trained counterpart, highlighting the potential of smaller, in-house LLMs in hardware design automation.

Paper Structure (27 sections, 19 figures, 5 tables)

Introduction
Background and Related Work
Background
Prior Work
LLM Training
Verilog Training Corpus
GitHub Corpus
Verilog Books Corpus
Baseline LLM Architectures
LLM fine-tuning
LLM Evaluation Setup
Problem Sets
LLM Inference
Input Parameters
Test benches
...and 12 more sections

Figures (19)

Figure 1: Experimental Evaluation of LLM Verilog Completions
Figure 2: Varying the prompt details: Low, Medium and High. Set I, Problem 15.
Figure 3: Set II, Vibrate & ring problem. Difficulty: Circuits (Combinational), Basic category. We highlight the mistake. The motor turns on even with ringer set to 0 and vibrate_mode set to 1.
Figure 4: Set I, Basic example - Problem 3: A 3-bit priority encoder. We highlight the mistake. Positions are offset by 1.
Figure 5: Set I, Intermediate example - Problem 6: A 1 to 12 counter. We highlight the mistake. Counter does not stop at 12.
...and 14 more figures
