Large Language Model for Verilog Code Generation: Literature Review and the Road Ahead

Guang Yang, Wei Zheng, Xiang Chen, Dong Liang, Peng Hu, Yukui Yang, Shaohang Peng, Zhenghan Li, Jiahui Feng, Xiao Wei, Kexin Sun, Deyuan Ma, Haotian Cheng, Yiheng Shen, Xing Hu, Terry Yue Zhuo, David Lo

TL;DR

This systematic literature review surveys 102 studies (2020–2025) on using large language models (LLMs) for Verilog code generation at the register transfer level (RTL), organizing the work into base LLMs and instruction-tuned LLMs (IT LLMs). It analyzes datasets and evaluation metrics, distinguishing benchmarks from instruction-tuning data, and details a wide spectrum of adaptation techniques from prompt engineering to reinforcement learning, along with alignment concerns (security, efficiency, copyright, hallucinations). The review identifies key limitations (hardware awareness gaps, data quality and coverage issues, and insufficient benchmarks) and offers a roadmap emphasizing multimodal inputs, PPA-aware optimization, executable evaluation, and industrial workflow integration. Overall, the work highlights substantial progress and practical opportunities to advance automated hardware design via LLMs, while calling for standardized benchmarks and deployment-ready alignment frameworks. The proposed roadmap envisions staged exploration from module-level generation to full system-level RTL, with interactive, explainable generation and stronger collaboration with hardware engineers.

Abstract

Code generation has emerged as a critical research area at the intersection of Software Engineering (SE) and Artificial Intelligence (AI), attracting significant attention from both academia and industry. Within this broader landscape, Verilog, as a representative hardware description language (HDL), plays a fundamental role in digital circuit design and verification, making its automated generation particularly significant for Electronic Design Automation (EDA). Consequently, recent research has increasingly focused on applying Large Language Models (LLMs) to Verilog code generation, particularly at the Register Transfer Level (RTL), exploring how these AI-driven techniques can be effectively integrated into hardware design workflows. Although substantial research efforts have explored LLM applications in this domain, a comprehensive survey synthesizing these developments remains absent from the literature. This review addresses this gap by providing a systematic literature review of LLM-based methods for Verilog code generation, examining their effectiveness, limitations, and potential for advancing automated hardware design. The review covers research from conferences and journals in the fields of SE, AI, and EDA, comprising 70 papers published in these venues, along with 32 high-quality preprint papers, for a total of 102 papers. By answering four key research questions, we aim to (1) identify the LLMs used for Verilog generation, (2) examine the datasets and metrics employed in evaluation, (3) categorize the techniques proposed for Verilog generation, and (4) analyze LLM alignment approaches for Verilog generation. Based on our findings, we identify a series of limitations of existing studies. Finally, we outline a roadmap highlighting potential opportunities for future research in LLM-assisted hardware design.

Paper Structure

This paper contains 90 sections, 7 equations, 1 figure, 5 tables.

Figures (1)

  • Figure 7: Usage of Evaluation Metrics Across Years (Counts from Surveyed Studies).