HDLCoRe: A Training-Free Framework for Mitigating Hallucinations in LLM-Generated HDL
Heng Ping, Shixuan Li, Peiyu Zhang, Anzhe Cheng, Shukai Duan, Nikos Kanakaris, Xiongye Xiao, Wei Yang, Shahin Nazarian, Andrei Irimia, Paul Bogdan
TL;DR
HDLCoRe targets hallucinations in LLM-generated HDL code with a training-free framework that combines HDL-aware Chain-of-Thought prompting with self-verification and a two-stage heterogeneous RAG system. It classifies HDL tasks by type and complexity, augments prompts with domain knowledge, generates and self-validates testbenches, and retrieves relevant HDL exemplars from a curated open-source database to guide generation. On RTLLM2.0, HDLCoRe improves functional correctness, with particularly large gains for smaller models, and remains robust across model scales and HDL task categories. The approach offers a practical, data-efficient path to higher-quality HDL generation without fine-tuning or external tooling, which could accelerate hardware design workflows and make HDL code generation more widely accessible.
Abstract
Recent advances in large language models (LLMs) have demonstrated remarkable capabilities in code generation tasks. However, when applied to hardware description languages (HDL), these models exhibit significant limitations due to data scarcity, resulting in hallucinations and incorrect code generation. To address these challenges, we propose HDLCoRe, a training-free framework that enhances LLMs' HDL generation capabilities through prompt engineering techniques and retrieval-augmented generation (RAG). Our approach consists of two main components: (1) an HDL-aware Chain-of-Thought (CoT) prompting technique with self-verification that classifies tasks by complexity and type, incorporates domain-specific knowledge, and guides LLMs through step-by-step self-simulation for error correction; and (2) a two-stage heterogeneous RAG system that addresses formatting inconsistencies through key component extraction and efficiently retrieves relevant HDL examples through sequential filtering and re-ranking. HDLCoRe eliminates the need for model fine-tuning while substantially improving LLMs' HDL generation capabilities. Experimental results demonstrate that our framework achieves superior performance on the RTLLM2.0 benchmark, significantly reducing hallucinations and improving both syntactic and functional correctness.
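To make the second component more concrete, the following is a minimal sketch of a two-stage retrieval pass: a cheap filter over example descriptions, followed by re-ranking on key components extracted from the HDL source so that formatting differences do not affect the match. It is an illustration under stated assumptions, not the authors' implementation. The names `HdlExample`, `tokenize`, `extract_key_components`, and `retrieve`, the Jaccard and overlap scores, and the choice of module and port names as "key components" are all hypothetical stand-ins; the abstract does not specify the actual encoders or scoring functions.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Keywords that can appear inside a Verilog port declaration (assumed subset).
PORT_KEYWORDS = {"input", "output", "inout", "wire", "reg", "logic", "signed"}


@dataclass
class HdlExample:
    description: str  # natural-language summary of the module
    code: str         # Verilog source text


def tokenize(text: str) -> set[str]:
    """Lowercase identifier-like tokens; a stand-in for a real text encoder."""
    return set(re.findall(r"[a-z_][a-z0-9_]*", text.lower()))


def extract_key_components(code: str) -> set[str]:
    """Collect module and port names, ignoring comments, layout, and bit ranges."""
    code = re.sub(r"//.*?$|/\*.*?\*/", "", code, flags=re.S | re.M)
    names = set(re.findall(r"\bmodule\s+(\w+)", code))
    for decl in re.findall(r"\b(?:input|output|inout)\b([^;)]*)", code):
        decl = re.sub(r"\[[^\]]*\]", " ", decl)  # drop ranges such as [7:0]
        names |= set(re.findall(r"[A-Za-z_]\w*", decl)) - PORT_KEYWORDS
    return {n.lower() for n in names}


def jaccard(a: set[str], b: set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query: str, database: list[HdlExample],
             coarse_k: int = 3, final_k: int = 1) -> list[HdlExample]:
    q = tokenize(query)
    # Stage 1 (assumed): coarse filter on description similarity.
    coarse = sorted(database,
                    key=lambda ex: jaccard(q, tokenize(ex.description)),
                    reverse=True)[:coarse_k]

    # Stage 2 (assumed): re-rank by how many extracted components the query names.
    def fine_score(ex: HdlExample) -> float:
        comps = extract_key_components(ex.code)
        return len(q & comps) / len(comps) if comps else 0.0

    return sorted(coarse, key=fine_score, reverse=True)[:final_k]


database = [
    HdlExample("4-bit up counter with synchronous reset",
               "module counter4(input clk, input rst, output reg [3:0] count);\nendmodule"),
    HdlExample("16-bit adder using ripple carry",
               "module rca16(input [15:0] x, input [15:0] y, output [16:0] s);\nendmodule"),
    HdlExample("8-bit adder with carry out",
               "module adder8(\n  input  [7:0] a,\n  input  [7:0] b,\n"
               "  output [8:0] sum\n);\n  assign sum = a + b; // add\nendmodule"),
    HdlExample("UART transmitter",
               "module uart_tx(input clk, input [7:0] data, output tx);\nendmodule"),
]
query = "Implement an 8-bit adder module named adder8 with inputs a and b and output sum."
best = retrieve(query, database)[0]
print(best.description)
```

Splitting retrieval this way keeps the expensive, code-aware comparison confined to a short candidate list. In a real system the token-overlap scores would likely be replaced by learned embeddings or a dedicated re-ranking model.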
