VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification
Patrick Yubeaton, Andre Nakkab, Weihua Xiao, Luca Collini, Ramesh Karri, Chinmay Hegde, Siddharth Garg
TL;DR
VeriThoughts addresses the scarcity of HDL training data by introducing a large-scale, reasoning-enabled Verilog dataset containing prompts, reasoning traces, and formal-verification labels. The authors build a pipeline linking four artifacts: ground-truth code $V$, prompt $Q$, reasoning trace $R$, and generated code $V^{*}$. Each sample also carries a self-consistency label $L_c$ obtained via Yosys-based equivalence checks. They show that reasoning traces and self-consistency can improve Verilog generation, and open-source models fine-tuned on VeriThoughts achieve state-of-the-art results on VerilogEval. The work illustrates the benefits of combining prompting, reasoning, and formal verification to produce verifiably correct HDL code, and it lays groundwork for expanding HDL data-generation tools.
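The four artifacts and the label $L_c$ can be pictured as a single record plus an equivalence check. The following is a minimal sketch, not the authors' implementation: the `Sample` class, its field names, and both helper functions are hypothetical, and it assumes that $L_c$ comes from checking $V^{*}$ against $V$ and that the two top modules have different names and matching ports. The Yosys commands are the standard `equiv_*` flow; the exact script used in the paper is not given in this summary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    """Hypothetical dataset record, named after the symbols in the summary."""
    prompt: str                 # Q: natural-language prompt
    reasoning: str              # R: reasoning trace
    generated_verilog: str      # V*: model-generated code
    ground_truth_verilog: str   # V: reference code
    consistent: Optional[bool] = None  # L_c: unset until the check has run


def equivalence_script(gold_path: str, gate_path: str,
                       gold_top: str, gate_top: str) -> str:
    """Build a Yosys script that asserts two modules are equivalent.

    Assumption: gold_top and gate_top are distinct module names with
    identical port lists, as equiv_make requires.
    """
    lines = [
        f"read_verilog {gold_path}",
        f"read_verilog {gate_path}",
        "proc",                                     # lower processes to netlists
        "opt_clean",                                # drop unused cells and wires
        f"equiv_make {gold_top} {gate_top} equiv",  # build the equivalence module
        "hierarchy -top equiv",
        "equiv_simple",                             # prove equivalence cells via SAT
        "equiv_induct",                             # inductive proof for sequential logic
        "equiv_status -assert",                     # error out if anything is unproven
    ]
    return "\n".join(lines)


def label_from_exit_code(sample: Sample, yosys_exit_code: int) -> Sample:
    """Set L_c: a zero exit code means every equivalence cell was proven."""
    sample.consistent = (yosys_exit_code == 0)
    return sample
```

The generated script can be written to a file and run with `yosys -q -s script.ys`; because `equiv_status -assert` raises an error on any unproven cell, the process exit code is enough to set the label.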
Abstract
This paper introduces VeriThoughts, a novel dataset designed for reasoning-based Verilog code generation. We establish a new benchmark framework grounded in formal verification methods to evaluate the quality and correctness of generated hardware descriptions. Additionally, we present a suite of specialized small-scale models optimized specifically for Verilog generation. Our work addresses the growing need for automated hardware design tools that can produce verifiably correct implementations from high-level specifications, potentially accelerating the hardware development process while maintaining rigorous correctness guarantees. Our code and data are available at https://github.com/wilyub/VeriThoughts.
