OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection
Fan Cui, Chenyang Yin, Kexing Zhou, Youwei Xiao, Guangyu Sun, Qiang Xu, Qipeng Guo, Demin Song, Dahua Lin, Xingcheng Zhang, Yun Liang
TL;DR
The paper presents OriGen, an open-source RTL code generation framework that combines code-to-code augmentation with a self-reflection loop that uses compiler feedback to fix syntax errors autonomously. It introduces two LoRA adapters (Gen LoRA and Fix LoRA) and uses Claude3-Haiku as a teacher to generate high-quality descriptions and corrected RTL code, refined through an Icarus Verilog verifier. A dedicated VerilogFixEval benchmark assesses self-reflection by measuring syntactic and functional correction capabilities, while ablation studies quantify the gains from augmentation and error-correction data. Experimental results show OriGen surpassing prior open-source RTL models and approaching GPT-4 Turbo in several metrics, with notable improvements in syntactic error correction and a demonstrated ability to learn from compiler feedback, highlighting the practicality of an open, privacy-preserving RTL design assistant.
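The self-reflection loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `generate` and `fix` callables are hypothetical stand-ins for the Gen LoRA and Fix LoRA models, the retry limit `max_rounds` is an assumed parameter, and `compile_with_iverilog` assumes the `iverilog` executable is available on PATH.

```python
import os
import subprocess
import tempfile
from typing import Callable, Tuple

# A compile function takes RTL source and returns (success, compiler log).
CompileFn = Callable[[str], Tuple[bool, str]]


def compile_with_iverilog(rtl: str) -> Tuple[bool, str]:
    """Compile Verilog with Icarus Verilog (assumes `iverilog` is on PATH)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "design.v")
        out = os.path.join(tmp, "design.out")
        with open(src, "w") as f:
            f.write(rtl)
        proc = subprocess.run(
            ["iverilog", "-o", out, src], capture_output=True, text=True
        )
        return proc.returncode == 0, proc.stderr


def self_reflect(
    spec: str,
    generate: Callable[[str], str],       # hypothetical: spec -> RTL
    fix: Callable[[str, str, str], str],  # hypothetical: (spec, RTL, log) -> RTL
    compile_fn: CompileFn = compile_with_iverilog,
    max_rounds: int = 3,                  # assumed retry limit
) -> Tuple[str, bool]:
    """Generate RTL, then repair it using compiler feedback until it compiles."""
    rtl = generate(spec)
    for _ in range(max_rounds):
        ok, log = compile_fn(rtl)
        if ok:
            return rtl, True
        # Feed the compiler's error log back to the fixing model.
        rtl = fix(spec, rtl, log)
    # Check the result of the final repair attempt.
    ok, _ = compile_fn(rtl)
    return rtl, ok
```

Passing the compiler as a parameter keeps the loop testable without a Verilog toolchain: a stub `compile_fn` can stand in for Icarus Verilog during experiments.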
Abstract
Recent studies have demonstrated the significant potential of Large Language Models (LLMs) in generating Register Transfer Level (RTL) code, with notable advancements showcased by commercial models such as GPT-4 and Claude3-Opus. However, these proprietary LLMs often raise concerns regarding privacy and security. While open-source LLMs offer solutions to these concerns, they typically underperform commercial models in RTL code generation tasks, primarily due to the scarcity of high-quality open-source RTL datasets. To address this challenge, we introduce OriGen, a fully open-source framework that incorporates self-reflection capabilities and a novel dataset augmentation methodology for generating high-quality, large-scale RTL code. Our approach employs a code-to-code augmentation technique to enhance the quality of open-source RTL code datasets. Furthermore, OriGen can rectify syntactic errors through a self-reflection process that leverages compiler feedback. Experimental results demonstrate that OriGen significantly outperforms other open-source alternatives in RTL code generation. It surpasses the previous best-performing open-source LLM by 12.8% and even exceeds GPT-4 Turbo in the pass@1 metric on the VerilogEval-Human benchmark. Moreover, OriGen exhibits superior capabilities in self-reflection and error correction, outperforming GPT-4 by 19.9% on a benchmark designed to evaluate self-reflection capabilities.
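For readers unfamiliar with the pass@1 metric mentioned above, the sketch below shows how pass@k is commonly computed for code generation benchmarks. It assumes the standard unbiased estimator, 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them pass the tests; the abstract does not state which estimator the authors use, so treat this as an assumption. The function names are illustrative.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples passes.

    n: samples generated for the problem, c: samples that passed, k: budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any draw of k must include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return sum(scores) / len(scores)
```

For k = 1 the estimator reduces to c / n, the fraction of generated samples that pass, so a problem with 3 passing samples out of 10 contributes 0.3 to the benchmark average.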
