Classification-Based Automatic HDL Code Generation Using LLMs

Wenhao Sun, Bing Li, Grace Li Zhang, Xunzhao Yin, Cheng Zhuo, Ulf Schlichtmann

TL;DR

This paper addresses hallucinations in HDL code generation by introducing a human-expert-inspired, training-free workflow for LLM-based HDL synthesis. The method starts with circuit-type classification, then extracts explicit information lists, and applies type-specific (SEQU/COMB) or general BEHAV generation, guided by a verification-driven search over code candidates. It demonstrates substantial improvements in functional correctness on VerilogEval-human and VerilogEval-machine datasets, highlighting the value of structured task decomposition, explicit information modeling, and budgeted verification over naive one-shot generation. The work offers a practical pathway to more reliable HDL generation with LLMs, reducing reliance on fine-tuning, large databases, or testbench-only feedback loops, and suggesting avenues for further refinement of information-list quality and budget strategies.

Abstract

While large language models (LLMs) have demonstrated the ability to generate hardware description language (HDL) code for digital circuits, they still suffer from hallucination, which leads them to generate incorrect HDL code or to misunderstand specifications. In this work, we introduce a human-expert-inspired method to mitigate LLM hallucination and improve performance in HDL code generation. We first let LLMs classify the type of the circuit based on the specifications. Then, according to the type of the circuit, we split the task into several sub-procedures, including information extraction and a human-like design flow using Electronic Design Automation (EDA) tools. We also use a search method to reduce the variation in code generation. Experimental results show that our method can significantly improve the functional correctness of the generated Verilog and reduce the hallucination of LLMs.
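
The control flow described above (classify, extract an information list, run a type-specific procedure, then fall back to BEHAV while reusing the list) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Backends` container, its four callables, and the `max_iterations` parameter are hypothetical names, not the authors' implementation, and the search over multiple code candidates is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Backends:
    """Hypothetical stand-ins for the LLM calls and the testbench run."""
    classify: Callable[[str], str]                  # spec -> "SEQU" or "COMB"
    extract_info: Callable[[str, str], List[str]]   # (spec, type) -> information list
    generate: Callable[[str, List[str], str], str]  # (spec, info, procedure) -> Verilog
    passes_testbench: Callable[[str], bool]         # Verilog source -> pass/fail


def generate_hdl(spec: str, b: Backends, max_iterations: int = 3) -> Optional[str]:
    """Return the first candidate that passes the testbench, or None."""
    circuit_type = b.classify(spec)
    # The information list is built once and reused in later iterations.
    info = b.extract_info(spec, circuit_type)
    for iteration in range(max_iterations):
        # First iteration: type-specific procedure; afterwards: general BEHAV.
        procedure = circuit_type if iteration == 0 else "BEHAV"
        code = b.generate(spec, info, procedure)
        if b.passes_testbench(code):
            return code
    return None
```

In practice the callables would wrap LLM prompts and a Verilog simulator; keeping them as plain functions makes the iteration logic easy to exercise with stubs.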

Paper Structure

This paper contains 11 sections, 1 equation, 12 figures, 3 tables.

Figures (12)

  • Figure 1: Comparison of the training-free methods. The naïve method suffers from the hallucination problem. The human-aided method requires human labor and is inefficient. The RAG method relies on the quality of the database, which may be expensive to build. By introducing the human-expert-inspired procedure, the proposed method can mitigate the hallucination of LLMs without human labor or a database.
  • Figure 2: The workflow of the proposed method. The first iteration is illustrated in (a), where the specifications are classified. Depending on the circuit type, either the SEQU or COMB procedure is executed and verified with a testbench. Subsequent iterations are illustrated in (b) and start at Last Iteration, where the information list is reused and the BEHAV procedure is executed.
  • Figure 3: Comparison of the Pass@k between two configurations, (5,3,2) and (7,2,1), where V.H. represents VerilogEval-human, and V.M. represents VerilogEval-machine. The total results of all the procedures are shown as FULL, and the individual results of the COMB, SEQU and BEHAV procedures are shown as COMB, SEQU and BEHAV, respectively.
  • Figure 4: Comparison of the task error rate distributions in difficult tasks, where each task's error rate is taken from its best-performing code sample in Pass@10. The x-axis represents the percentage of the error rate intervals, and the y-axis represents the configurations. V.H. represents VerilogEval-human, and V.M. represents VerilogEval-machine.
  • Figure 5: Comparison of snippets of the information lists between two generation attempts in a sequential logic task. The upper section is from the attempt where the code passed the testbench, while the lower section is from the attempt where the code failed during simulation.
  • ...and 7 more figures
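
The captions above report Pass@k without defining it in this excerpt. A common definition in code-generation benchmarks is the unbiased estimator 1 - C(n-c, k) / C(n, k), where n samples are drawn per task and c of them pass. The sketch below assumes the paper follows that convention, which the text shown here does not confirm; the function name `pass_at_k` is chosen for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples passes.

    n: total samples generated for a task
    c: number of those samples that pass the testbench
    k: sample budget being evaluated (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A benchmark-level score would typically be the mean of this value over all tasks.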