
LLM-Aided Testbench Generation and Bug Detection for Finite-State Machines

Jitendra Bhandari, Johann Knechtel, Ramesh Narayanaswamy, Siddharth Garg, Ramesh Karri

TL;DR

This work investigates using LLMs (GPT3.5 and GPT4) to automate FSM testbench generation and RTL bug detection, guided by feedback from commercial EDA tools to improve transition coverage. The authors propose a two-pronged methodology: (i) coverage-guided testbench generation that iteratively refines LLM outputs using coverage reports, and (ii) LLM-guided bug detection that flags mismatches between simulation traces and natural-language specifications. Evaluated on 100 FSMs from HDLBits and GitHub, the framework demonstrates substantial gains in coverage and effective bug detection, with GPT4 showing stronger performance and fewer iterations than GPT3.5 in many cases. However, scalability to large FSMs remains challenging, motivating prompt-engineering strategies (e.g., resetting, IO-pattern subdivision) and the continued integration of EDA feedback for practical hardware verification workflows.

Abstract

This work investigates the potential of tailoring Large Language Models (LLMs), specifically GPT3.5 and GPT4, for the domain of chip testing. A key aspect of chip design is functional testing, which relies on testbenches to evaluate the functionality and coverage of Register-Transfer Level (RTL) designs. We aim to enhance testbench generation by incorporating feedback from commercial-grade Electronic Design Automation (EDA) tools into LLMs. Through iterative feedback from these tools, we refine the testbenches to achieve improved test coverage. Our case studies present promising results, demonstrating that this approach can effectively enhance test coverage. By integrating EDA tool feedback, the generated testbenches become more accurate in identifying potential issues in the RTL design. Furthermore, we extend our study to use this enhanced test coverage framework for detecting bugs in RTL implementations.
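The coverage-guided refinement loop described above can be sketched as a toy model. This is an illustrative sketch only, not the authors' implementation: `propose_stimuli` stands in for the LLM call that extends the testbench when shown the uncovered-transition report, and the coverage bookkeeping stands in for the EDA tool's coverage report.

```python
def propose_stimuli(uncovered):
    # Stand-in for an LLM prompt such as "these FSM transitions are
    # still uncovered; extend the testbench to exercise them".
    # Here we deterministically cover half of the remaining transitions.
    ordered = sorted(uncovered)
    return set(ordered[: max(1, len(ordered) // 2)])

def refine_testbench(transitions, max_iters=5):
    """Toy model of the coverage-guided loop: each iteration, the 'LLM'
    proposes stimuli guided by the previous round's coverage report."""
    uncovered = set(transitions)
    coverage_trace = []
    for _ in range(max_iters):
        uncovered -= propose_stimuli(uncovered)
        coverage_trace.append(1 - len(uncovered) / len(transitions))
        if not uncovered:
            break
    return coverage_trace

# 16 transitions of a hypothetical 4-state FSM
trace = refine_testbench([f"s{i}->s{j}" for i in range(4) for j in range(4)])
print(trace)  # monotonically increasing transition coverage
```

The monotonically rising coverage trace mirrors the paper's observation that iterating on EDA coverage feedback steadily improves transition coverage, with stronger models (e.g. GPT4) typically needing fewer iterations.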

Paper Structure

This paper contains 17 sections, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Contribution 1: LLM-aided Testbench Generation.
  • Figure 2: Contribution 2: LLM-aided Bug Detection.
  • Figure 3: Exemplary prompts used with GPT.
  • Figure 4: Additional Prompt for GPT during detection step.
  • Figure 5: An LLM-generated testbench exercising the design under test, in this instance an FSM.
  • ...and 3 more figures