HaVen: Hallucination-Mitigated LLM for Verilog Code Generation Aligned with HDL Engineers
Yiyao Yang, Fu Teng, Pengju Liu, Mengnan Qi, Chenyang Lv, Ji Li, Xuhong Zhang, Zhezhi He
TL;DR
HaVen tackles hallucinations in Verilog code generation by introducing a Verilog-specific taxonomy and a three-stage framework that combines symbolic interpretation via SI-CoT with knowledge- and logic-enhanced data augmentation. The method aligns LLM outputs with HDL engineer practices and improves functional correctness across VerilogEval and RTLLM benchmarks, outperforming baselines with notable gains from both SI-CoT and KL-dataset components. The authors demonstrate robust handling of symbolic modalities (truth tables, waveforms, state diagrams) and provide ablations showing additive benefits from the proposed data and prompting strategies. Overall, HaVen offers a practical pathway to reliable HDL code generation with publicly available tooling and datasets, advancing HDL automation via LLMs.
Abstract
Recently, the use of large language models (LLMs) for Verilog code generation has attracted great research interest as a way to enable hardware design automation. However, previous works have shown a gap between the ability of LLMs and the practical demands of hardware description language (HDL) engineering. This gap includes differences in how engineers phrase questions and hallucinations in the generated code. To address these challenges, we introduce HaVen, a novel LLM framework designed to mitigate hallucinations and align Verilog code generation with the practices of HDL engineers. HaVen tackles hallucination issues by proposing a comprehensive taxonomy and employing a chain-of-thought (CoT) mechanism to translate symbolic modalities (e.g., truth tables and state diagrams) into accurate natural language descriptions. Furthermore, HaVen narrows this gap with a data augmentation strategy that synthesizes high-quality instruction-code pairs matching real HDL engineering practices. Our experiments demonstrate that HaVen significantly improves the correctness of Verilog code generation, outperforming state-of-the-art LLM-based Verilog generation methods on the VerilogEval and RTLLM benchmarks. HaVen is publicly available at https://github.com/Intelligent-Computing-Research-Group/HaVen.
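
To make the idea of translating a symbolic modality into natural language concrete, the following is a minimal illustrative sketch in Python. It is not the HaVen implementation: HaVen performs this interpretation through chain-of-thought prompting of an LLM, whereas this sketch uses a deterministic rule purely to show the kind of description such a step might produce for a truth table. The function name `describe_truth_table` and its input format are assumptions made for this example.

```python
from typing import Sequence, Tuple


def describe_truth_table(
    inputs: Sequence[str],
    output: str,
    rows: Sequence[Tuple[Tuple[int, ...], int]],
) -> str:
    """Render a truth table as one plain-English sentence.

    Hypothetical helper for illustration only; each row is
    ((input bits...), output bit).
    """
    high_cases = []
    for values, result in rows:
        if result == 1:
            # Spell out the input assignment that drives the output high.
            terms = [f"{name}={bit}" for name, bit in zip(inputs, values)]
            high_cases.append(" and ".join(terms))

    if not high_cases:
        return f"{output} is always 0."
    if len(high_cases) == len(rows):
        return f"{output} is always 1."
    return (
        f"{output} is 1 when "
        + ", or when ".join(high_cases)
        + f"; otherwise {output} is 0."
    )


# Example: a two-input XOR truth table.
xor_rows = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
description = describe_truth_table(["a", "b"], "y", xor_rows)
print(description)
```

A description of this form could then be placed in the prompt instead of the raw table, which is the general intent of converting symbolic inputs to text before code generation.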
