SpecLoop: An Agentic RTL-to-Specification Framework with Formal Verification Feedback Loop

Fu-Chieh Chang, Yu-Hsin Yang, Hung-Ming Huang, Yun-Chia Hsu, Yin-Yu Lin, Ming-Fang Tsai, Chun-Chih Yang, Pei-Yuan Wu

TL;DR

Experiments show that incorporating formal verification feedback substantially improves specification correctness and robustness over LLM-only baselines, demonstrating the effectiveness of verification-guided specification generation.

Abstract

RTL implementations frequently lack up-to-date or consistent specifications, making comprehension, maintenance, and verification costly and error-prone. While prior work has explored generating specifications from RTL using large language models (LLMs), ensuring that the generated documents faithfully capture design intent remains a major challenge. We present SpecLoop, an agentic framework for RTL-to-specification generation with a formal-verification-driven iterative feedback loop. SpecLoop first generates candidate specifications and then reconstructs RTL from them; it then runs formal equivalence checking between the reconstructed RTL and the original design to validate functional consistency. When mismatches are detected, counterexamples are fed back to iteratively refine the specifications until equivalence is proven or no further progress can be made. Experiments across multiple LLMs and RTL benchmarks show that incorporating formal verification feedback substantially improves specification correctness and robustness over LLM-only baselines, demonstrating the effectiveness of verification-guided specification generation.
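The control flow described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `spec_loop`, `CheckResult`, `max_rounds`, and the three callables are hypothetical, and the toy stubs stand in for the LLM calls and the formal equivalence checker, neither of which is specified here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CheckResult:
    """Outcome of one equivalence check (hypothetical structure)."""
    equivalent: bool
    diagnostic: Optional[str] = None  # e.g. a counterexample or compiler error


def spec_loop(
    rtl: str,
    generate_spec: Callable[[str, Optional[str], Optional[str]], str],
    reconstruct_rtl: Callable[[str], str],
    check_equivalence: Callable[[str, str], CheckResult],
    max_rounds: int = 5,  # assumed budget; the abstract gives no number
) -> Tuple[str, bool]:
    """Refine a spec until the reconstructed RTL matches the original.

    Returns (spec, verified). Stops when equivalence is shown, when the
    refinement no longer changes the spec, or when the budget runs out.
    """
    spec = generate_spec(rtl, None, None)  # first round: no feedback yet
    for round_index in range(max_rounds):
        result = check_equivalence(rtl, reconstruct_rtl(spec))
        if result.equivalent:
            return spec, True
        if round_index == max_rounds - 1:
            break  # budget exhausted; keep the last checked spec
        revised = generate_spec(rtl, spec, result.diagnostic)
        if revised == spec:
            break  # no further progress
        spec = revised
    return spec, False


# Toy stand-ins (assumptions): the "RTL" is just the string "sync" or "async".
def toy_generate(rtl: str, prev_spec: Optional[str], diagnostic: Optional[str]) -> str:
    # The first attempt is deliberately wrong; the refinement fixes it.
    return "reset=async" if prev_spec is None else "reset=sync"


def toy_reconstruct(spec: str) -> str:
    return spec.split("=")[1]


def toy_check(original: str, candidate: str) -> CheckResult:
    if original == candidate:
        return CheckResult(True)
    return CheckResult(False, f"mismatch: expected {original}, got {candidate}")
```

Passing the three steps in as callables keeps the loop independent of any particular model or checker, so the same skeleton could be exercised with stubs, as in the toy functions above, before wiring in real tools.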

Paper Structure (20 sections, 4 figures, 3 tables)

Figures (4)

  • Figure 1: The architecture of SpecLoop and the spec verifier.
  • Figure 2: Multi-step prompt templates for the specification generator. (Left) First round: analyze RTL, write a structured specification, and self-check. (Right) Refinement: use verifier diagnostics (e.g., compiler errors or counterexamples) to edit only affected fields and keep the rest unchanged. Some line breaks are omitted due to page limits.
  • Figure 3: Ratio of verified specifications vs. unverified for different models and RR scores averaged over benchmarks and verifier variants. Verified specs show higher RR=1 proportions than unverified ones across models, especially for stronger models.
  • Figure 4: Qualitative Analysis. Selected text segments are highlighted in red for clarity. This example shows how SpecLoop fixes a spec error: the first-round spec wrongly states an asynchronous reset, reconstruction then fails equivalence checking, and the diagnosis guides the next round to revise the spec to a synchronous reset, after which the verifier passes.
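
To make the Figure 4 scenario concrete, the sketch below shows why a specification that states an asynchronous reset cannot pass an equivalence check against a design with a synchronous reset. It is a simplified illustration under stated assumptions: the one-bit register model, the function names `simulate` and `find_counterexample`, and the bounded brute-force search are all hypothetical, and the search stands in for a formal equivalence checker, which the paper's verifier would use instead.

```python
from itertools import product


def simulate(trace, asynchronous):
    """Simulate a 1-bit register over a trace of (clk, rst, d) samples.

    With asynchronous=True the reset clears q as soon as rst is high;
    otherwise rst only takes effect on a rising clock edge.
    Returns the value of q after each sample.
    """
    q, prev_clk = 0, 0
    outputs = []
    for clk, rst, d in trace:
        rising = clk == 1 and prev_clk == 0
        if asynchronous and rst:
            q = 0  # reset acts immediately, independent of the clock
        elif rising:
            q = 0 if rst else d  # reset or load on the clock edge
        outputs.append(q)
        prev_clk = clk
    return outputs


def find_counterexample(max_len=3):
    """Return the first input trace on which the two resets disagree."""
    samples = list(product((0, 1), repeat=3))  # all (clk, rst, d) values
    for length in range(1, max_len + 1):
        for trace in product(samples, repeat=length):
            if simulate(trace, True) != simulate(trace, False):
                return list(trace)
    return None
```

In this model the two variants first diverge when the register holds 1 and `rst` goes high without a rising clock edge: the asynchronous version clears immediately while the synchronous one keeps its value. A trace of that kind is the sort of diagnostic that can guide the next round to revise the reset description.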