R3A: Reliable RTL Repair Framework with Multi-Agent Fault Localization and Stochastic Tree-of-Thoughts Patch Generation

Zizhang Luo, Fan Cui, Kexing Zhou, Runlin Guo, Mile Xia, Hongyuan Hou, Yun Liang

TL;DR

This paper tackles reliable RTL repair by addressing two key challenges in LLM-based methods: the inherent randomness of model outputs and the long input contexts formed by RTL code and waveforms. It introduces R3A, a framework that combines a stochastic Tree-of-Thoughts patch-generation agent with a multi-agent fault localization module, coordinated through an Agent-Debugger Interface that links code, waveforms, and EDA tools. On the RTL-repair benchmark, R3A fixes 90.6% of bugs within the given time limit and achieves an average pass@5 of 86.7%, covering 45% more bugs than traditional methods and other LLM-based approaches. The results indicate that structured exploration and distributed fault analysis can substantially improve the practicality of LLM-driven RTL debugging in hardware design.

Abstract

Repairing RTL bugs is crucial for hardware design and verification. Traditional automatic program repair (APR) methods define dedicated search spaces to locate and fix bugs with program synthesis. However, they rely heavily on fixed templates and can handle only a limited range of bugs. As an alternative, Large Language Models (LLMs), which can understand code semantics, can be explored for RTL repair. However, they suffer from unreliable outcomes due to their inherent randomness and the long input contexts of RTL code and waveforms. To address these challenges, we propose R3A, an LLM-based automatic RTL program repair framework built on top of a base model to improve reliability. R3A proposes a stochastic Tree-of-Thoughts method that controls a patch generation agent as it searches for a validated solution to the bug. The algorithm samples search states according to a heuristic function to balance exploration and exploitation for a reliable outcome. In addition, R3A proposes a multi-agent fault localization method that finds fault candidates to serve as starting points for the patch generation agent, further increasing reliability. Experiments show that R3A can fix 90.6% of bugs in the RTL-repair dataset within a given time limit, covering 45% more bugs than traditional methods and other LLM-based approaches, while achieving an 86.7% pass@5 rate on average, which indicates high reliability.
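
The idea of sampling search states according to a heuristic, rather than always expanding the best-scoring one, can be illustrated with a small sketch. The paper's actual heuristic function and sampling rule are not given in this summary, so the softmax weighting, the `temperature` parameter, and the names `sample_state`, `states`, and `scores` below are assumptions made purely for illustration.

```python
import math
import random


def sample_state(states, scores, temperature=1.0, rng=random):
    """Pick one search state with probability proportional to
    exp(score / temperature).

    Assumption: softmax weighting stands in for the paper's heuristic-based
    sampling, which this summary does not specify.
    """
    if not states or len(states) != len(scores):
        raise ValueError("states and scores must be non-empty and equal in length")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp((s - top) / temperature) for s in scores]
    return rng.choices(states, weights=weights, k=1)[0]


# Hypothetical search states (candidate patches) and heuristic scores.
states = ["root", "patch_a", "patch_b"]
scores = [0.1, 0.9, 0.5]
chosen = sample_state(states, scores, temperature=0.5, rng=random.Random(0))
```

In this sketch a low temperature concentrates sampling on the highest-scoring state (exploitation), while a high temperature spreads it across all states (exploration).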

Paper Structure

This paper contains 15 sections, 1 equation, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: Overview of the R3A Framework
  • Figure 2: Search Problem Definition and the Search Tree
  • Figure 3: Detailed workflow of the Multi-Agent Anomaly Detection for Fault Localization
  • Figure 4: Comparison of pass@1 rate between different versions. The figure combines a histogram with Kernel Density Estimation (KDE) curves. A curve that is higher on the right side indicates a generally more reliable method.
  • Figure 5: Comparison of the time and token usage between different versions. Each point is one test. Tokens are given in millions and time in seconds. The red line marks the time and token budget; a point that falls outside this range counts as a failure.
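
The pass@1 and pass@5 rates mentioned above are commonly computed with the unbiased pass@k estimator used in code-generation evaluations. This summary does not state how the paper computes them, so the sketch below is an assumption about the metric, and the function name `pass_at_k` and the sample counts are hypothetical.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate the probability that at least one of k attempts, drawn
    without replacement from n attempts of which c succeeded, is a success.

    Uses the common estimator 1 - C(n - c, k) / C(n, k).
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        return 1.0  # every size-k subset contains at least one success
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical bug: 10 repair attempts, 3 of which pass verification.
single = pass_at_k(10, 3, 1)
five = pass_at_k(10, 3, 5)
```

An average pass@k over a benchmark would then be the mean of this per-bug value across all bugs.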