AutoBench: Automatic Testbench Generation and Evaluation Using LLMs for HDL Design

Ruidi Qiu, Grace Li Zhang, Rolf Drechsler, Ulf Schlichtmann, Bing Li

TL;DR

AutoBench introduces an end-to-end framework for automatic testbench (TB) generation using large language models (LLMs), coupled with AutoEval for automated TB evaluation. By combining a forward-generation workflow with a self-enhancement loop (code completion, scenario checking, and auto debugging), it improves TB quality and coverage over a baseline that generates TBs directly with LLMs. The framework supports both combinational and sequential designs under test (DUTs) and remains robust even on LLM-generated RTL designs, which makes it practically relevant for HDL verification. The open-source release enables replication and further development of LLM-assisted hardware verification tools and benchmarks.

Abstract

In digital circuit design, testbenches constitute the cornerstone of simulation-based hardware verification. Traditional methodologies for testbench generation remain partially manual, which makes testing diverse scenarios inefficient and consumes expensive designer time. Large Language Models (LLMs) have demonstrated their potential in automating the circuit design flow. However, directly applying LLMs to generate testbenches suffers from a low pass rate. To address this challenge, we introduce AutoBench, the first LLM-based testbench generator for digital circuit design, which requires only the description of the design under test (DUT) to automatically generate comprehensive testbenches. In AutoBench, a hybrid testbench structure and a self-checking system are realized using LLMs. To validate the generated testbenches, we also introduce an automated testbench evaluation framework that assesses their quality from multiple perspectives. Experimental results demonstrate that AutoBench achieves a 57% improvement in the testbench pass@1 ratio compared with the baseline that directly generates testbenches using LLMs. For 75 sequential circuits, AutoBench achieves a testbench pass@1 ratio 3.36 times that of the baseline. The source code and experimental results are open-sourced at https://github.com/AutoBench/AutoBench
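
The abstract reports results as a pass@1 ratio without defining it here. As a minimal sketch, assuming the unbiased pass@k estimator commonly used in code-generation evaluation (the authors' exact definition may differ), the metric could be computed as follows. The function names `pass_at_k` and `mean_pass_at_k` and the sample counts in the usage note are hypothetical and are not taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task as 1 - C(n - c, k) / C(n, k).

    n: number of testbenches generated for the task
    c: number of those testbenches that pass evaluation
    k: number of attempts allowed
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Every size-k subset contains at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks given as (n, c) pairs."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to c / n per task, so `mean_pass_at_k([(10, 3), (10, 7)], 1)` is simply the average fraction of passing testbenches across the two illustrative tasks.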

Paper Structure

This paper contains 41 sections, 1 equation, 13 figures, and 2 tables.

Figures (13)

  • Figure 1: Outline of AutoBench workflow and AutoEval evaluation framework.
  • Figure 2: The driver and the checker in a TB, assuming the DUT has two input ports "a" and "b" and one output port "c".
  • Figure 3: AutoBench: The TB generation workflow in detail.
  • Figure 4: AutoEval: The evaluation framework in detail.
  • Figure 5: The distribution of Eval2 coverages among tasks passing Eval1.
  • ...and 8 more figures
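
The driver/checker split named in the Figure 2 caption can be illustrated with a small sketch. This is not the authors' implementation: the 1-bit AND gate standing in for the DUT, the reference model, and all function names are assumptions made for illustration. Only the port names "a", "b", and "c" come from the caption.

```python
from itertools import product


def dut(a: int, b: int) -> int:
    """Hypothetical stand-in for a simulated DUT: a 1-bit AND gate."""
    return a & b


def reference_model(a: int, b: int) -> int:
    """Expected output c, written independently of the DUT (assumed spec)."""
    return 1 if (a == 1 and b == 1) else 0


def driver():
    """Driver: yield every stimulus for the two 1-bit input ports a and b."""
    yield from product((0, 1), repeat=2)


def checker(a: int, b: int, c: int) -> bool:
    """Checker: compare the observed output c against the reference model."""
    return c == reference_model(a, b)


def run_testbench(dut_fn):
    """Apply each stimulus to dut_fn and collect the failing (a, b, c) tuples."""
    failures = []
    for a, b in driver():
        c = dut_fn(a, b)
        if not checker(a, b, c):
            failures.append((a, b, c))
    return failures
```

Calling `run_testbench(dut)` returns an empty list for the stand-in AND gate, while a faulty design such as `lambda a, b: a | b` is reported through the failing input/output tuples.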