MEIC: Re-thinking RTL Debug Automation using LLMs
Ke Xu, Jialin Sun, Yuchen Hu, Xinwei Fang, Weiwei Shan, Xi Wang, Zhe Jiang
TL;DR
MEIC reframes RTL debugging as an iterative, multi-agent process that leverages two specialized LLMs and a rollback-enabled repository to automatically identify and fix syntax and function errors in Verilog code. It integrates an RTL toolchain, testbenches, and simulations with error classification, domain-specific tuning, and a scorer to manage LLM uncertainty, accompanied by open-source tooling and a 178-instance Verilog error dataset. Empirical results show syntax and function fix rates of 93% and 78%, respectively, with up to 48x speedups compared to skilled engineers, demonstrating substantial practical impact for RTL debugging automation and reproducibility. The work advances RTL debugging by combining iterative LLM reasoning, domain knowledge, and robust governance mechanisms, offering a scalable path toward more reliable hardware verification workflows.
Abstract
Large Language Models (LLMs) are widely used for debugging software code (e.g., C and Python), benefiting from their ability to understand and interpret intricate concepts. However, in the semiconductor industry, LLM-based debugging of Register Transfer Level (RTL) code remains inadequate, largely due to the underrepresentation of RTL-specific data in training sets. This work introduces a novel framework, Make Each Iteration Count (MEIC), which contrasts with traditional one-shot LLM-based debugging methods that rely heavily on prompt engineering, model tuning, and model training. MEIC applies LLMs in an iterative process to overcome their limitations in RTL code debugging. The framework identifies and corrects both syntax and function errors while managing the uncertainties inherent in LLM outputs. To evaluate our framework, we provide an open-source dataset comprising 178 common RTL programming errors. The experimental results demonstrate that the proposed debugging framework achieves a fix rate of 93% for syntax errors and 78% for function errors, with up to 48x speedup in the debugging process when compared with experienced engineers. The dataset and code are available at: https://anonymous.4open.science/r/Verilog-Auto-Debug-6E7F/.
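
To make the iterative idea concrete, the following is a minimal Python sketch of a debug loop with a scorer and rollback, as summarised above. It is an illustration under stated assumptions, not the MEIC implementation: the names `iterative_debug`, `propose_fix`, `run_checks`, and `score` are hypothetical, the callbacks stand in for the debug LLM, the RTL toolchain (compile plus simulation), and the scorer, and the accept-if-not-worse rule is an assumed policy. The toy stubs at the end replace real tools so the loop can be run as-is.

```python
from typing import Callable, List, Tuple


def iterative_debug(
    code: str,
    propose_fix: Callable[[str, str], str],         # hypothetical stand-in for the debug LLM
    run_checks: Callable[[str], Tuple[bool, str]],  # hypothetical stand-in for compile + simulate
    score: Callable[[str], float],                  # hypothetical stand-in for the scorer
    max_iters: int = 10,
) -> Tuple[str, bool, int]:
    """Return (final code, whether checks pass, number of fix proposals used)."""
    history: List[str] = [code]  # accepted versions; the last entry is the current one
    best_score = score(code)

    for iteration in range(1, max_iters + 1):
        passed, log = run_checks(history[-1])
        if passed:
            return history[-1], True, iteration - 1

        candidate = propose_fix(history[-1], log)
        candidate_score = score(candidate)

        # Assumed policy: accept a candidate only if the scorer does not rate it
        # worse; otherwise roll back by keeping the last accepted version.
        if candidate_score >= best_score:
            history.append(candidate)
            best_score = candidate_score

    passed, _ = run_checks(history[-1])
    return history[-1], passed, max_iters


# Toy stubs so the loop runs without an LLM or simulator (purely illustrative):
# each occurrence of the marker "BUG" in the text counts as one remaining error.
def toy_checks(code: str) -> Tuple[bool, str]:
    remaining = code.count("BUG")
    return remaining == 0, f"{remaining} error(s) remaining"


def toy_score(code: str) -> float:
    return -float(code.count("BUG"))


# The first proposal makes things worse and is rolled back; the next two each
# remove one error.
_attempts = iter([
    lambda c: c + " BUG",
    lambda c: c.replace("BUG", "", 1),
    lambda c: c.replace("BUG", "", 1),
])


def toy_fix(code: str, log: str) -> str:
    return next(_attempts)(code)


fixed, ok, proposals = iterative_debug(
    "a BUG b BUG", toy_fix, toy_checks, toy_score, max_iters=5
)
```

In this toy run the first proposal is rejected because it lowers the score, which shows why a rollback step matters when the model's output is uncertain: a bad suggestion costs one iteration rather than corrupting the working copy.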
