HDL-GPT: High-Quality HDL is All You Need
Bhuvnesh Kumar, Saurav Nanda, Ganapathy Parthasarathy, Pawan Patil, Austin Tsai, Parivesh Choudhary
TL;DR
HDL-GPT tackles the challenge of high-quality HDL code generation and analysis by leveraging a large, curated HDL data source augmented through a chain-of-thought prompting workflow. The approach couples automated HDL data augmentation with PEFT and NEFTune fine-tuning to produce HDL-GPT variants that achieve state-of-the-art performance on HDL benchmarks for code generation, explanation, bug detection, and testbench creation. Across the NYU and NVIDIA benchmarks, HDL-GPT and HDL-GPT2 consistently outperform baselines, and human-machine evaluations support strong quality of results (QoR) and robust zero-shot generalization. The work underscores the importance of data quality, prompt design, and targeted fine-tuning for advancing IC-design-oriented language models, and points to future integration within IC design flows and richer benchmarking.
Abstract
This paper presents Hardware Description Language Generative Pre-trained Transformers (HDL-GPT), a novel approach that leverages the vast repository of open-source Hardware Description Language (HDL) code to train high-quality large code models. The core premise of this paper is the hypothesis that high-quality HDL is all you need to create models with exceptional performance and broad zero-shot generalization abilities. The paper describes the methods used to curate and augment large corpora of open-source HDL code, transforming data of highly variable quality into high-quality data through careful prompting and context maintenance. We demonstrate that careful selection, filtering, and augmentation of data across HDLs can yield powerful models that surpass current state-of-the-art (SOTA) models. We also explore the impact of different fine-tuning methods on the quality of results, and we report experimental results across a range of fine-tuned SOTA LLMs that substantiate these claims. We demonstrate improvements of 50% to 200% over SOTA HDL models on current benchmarks, on tasks including HDL circuit explanation, code generation, formal and simulation testbench creation, bug triage, and bug fixing. HDL-GPT opens new avenues for the development of advanced model training techniques for circuit design tasks.
