RTLFixer: Automatically Fixing RTL Syntax Errors with Large Language Models

Yun-Da Tsai, Mingjie Liu, Haoxing Ren

TL;DR

RTLFixer tackles the pervasive problem of syntax errors in LLM-generated Verilog by introducing a Retrieval-Augmented Generation and ReAct-based autonomous agent framework for interactive debugging. The approach leverages a non-parametric memory of compiler logs and expert guidance to guide iterative code revisions until syntactic correctness is achieved. On a VerilogEval-derived syntax dataset (VerilogEval-syntax), RTLFixer attains 98.5% syntax-error resolution and substantial pass@1 improvements, with demonstrated generalization to RTLLM benchmarks. The work contributes a practical debugging paradigm for RTL design and a dataset to enable future syntax-focused evaluation.
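The iterative revision loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `GUIDANCE`, `retrieve_guidance`, `fix_syntax`, `compile_fn`, and `revise_fn` are hypothetical, the guidance keys are illustrative message fragments rather than verbatim compiler output, and simple substring matching stands in for whatever retrieval the paper actually uses. The compiler and the LLM are passed in as plain callables so the control flow can be read on its own.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical guidance store: illustrative message fragment -> expert hint.
GUIDANCE = {
    "syntax error": "Check for a missing semicolon or an unmatched begin/end.",
    "not a valid l-value": "Declare signals assigned in an always block as reg.",
}


def retrieve_guidance(log: str) -> List[str]:
    """Return every stored hint whose key occurs in the compiler log."""
    lowered = log.lower()
    return [hint for key, hint in GUIDANCE.items() if key in lowered]


def fix_syntax(
    code: str,
    compile_fn: Callable[[str], Optional[str]],
    revise_fn: Callable[[str, str, List[str]], str],
    max_iters: int = 10,
) -> Tuple[str, bool]:
    """Compile, retrieve hints for the error log, revise, and repeat.

    compile_fn returns None on success or an error log on failure.
    revise_fn stands in for the LLM call that rewrites the code.
    """
    for _ in range(max_iters):
        log = compile_fn(code)
        if log is None:
            return code, True  # compiles cleanly, stop here
        hints = retrieve_guidance(log)
        code = revise_fn(code, log, hints)
    # Iteration budget exhausted: report whether the last revision compiles.
    return code, compile_fn(code) is None
```

In practice `compile_fn` would wrap a real compiler invocation and `revise_fn` would prompt a model with the code, the log, and the retrieved hints; both are left abstract here.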

Abstract

This paper presents RTLFixer, a novel framework that enables automatic fixing of syntax errors in Verilog code with Large Language Models (LLMs). Despite LLMs' promising capabilities, our analysis indicates that approximately 55% of errors in LLM-generated Verilog are syntax-related, leading to compilation failures. To tackle this issue, we introduce a novel debugging framework that employs Retrieval-Augmented Generation (RAG) and ReAct prompting, enabling LLMs to act as autonomous agents that interactively debug the code with feedback. This framework demonstrates exceptional proficiency in resolving syntax errors, successfully correcting about 98.5% of compilation errors in our debugging dataset, which comprises 212 erroneous implementations derived from the VerilogEval benchmark. Our method leads to 32.3% and 10.1% increases in pass@1 success rates on the VerilogEval-Machine and VerilogEval-Human benchmarks, respectively.
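For readers unfamiliar with the metric, pass@1 figures of this kind are conventionally computed with the unbiased pass@k estimator from the code-generation literature. The sketch below assumes that convention; whether the paper uses exactly this form is not stated in the abstract. Here `n` is the number of samples generated per problem, `c` the number that pass, and `k` the sampling budget; the function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k from n samples with c passing ones."""
    if n - c < k:
        # Fewer than k failing samples: any draw of k contains a passing one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For `k = 1` the estimator reduces to the plain fraction `c / n`, so a gain in pass@1 corresponds directly to a larger share of generated samples that pass.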

Paper Structure

This paper contains 18 sections, 2 equations, 7 figures, and 3 tables.

Figures (7)

  • Figure 1: Overview of RTLFixer. The Autonomous Language Agent fixes the syntax error via a feedback loop. ReAct handles the iterative code refinement with intermediate reasoning and action steps. Human expert guidance is incorporated through RAG.
  • Figure 2: Prompts used for ReAct. (a) shows the One-shot prompting template with feedback message. (b)-(c) demonstrate the example where LLMs serve as autonomous agents with ReAct to decompose syntax fixing problems with reasoning and planning.
  • Figure 3: Examples of common error categories that LLMs consistently fail to resolve, with the corresponding human expert guidance stored in the retrieval database.
  • Figure 4: VerilogEval pass@1 results before (inner) and after (outer) syntax error fixing with RTLFixer.
  • Figure 5: Example of compiler log from iverilog and Quartus. Quartus feedback messages are more informative.
  • ...and 2 more figures