Table of Contents
Fetching ...

Saarthi: The First AI Formal Verification Engineer

Aman Kumar, Deepak Narayan Gadde, Keerthan Kopparam Radhakrishna, Djones Lettnin

TL;DR

The paper addresses the challenge of achieving fully autonomous end-to-end formal verification for RTL designs by introducing Saarthi, an agentic AI-based verification engine. Saarthi orchestrates multiple AI agents to generate a verification vPlan, create SystemVerilog Assertions (SVA), prove properties, analyze counterexamples, and assess formal coverage, all within a human-guided, hyper-structured workflow that includes a human-in-the-loop (HIL) mechanism to prevent looping. Benchmark results show that the approach yields meaningful end-to-end verification in a substantial fraction of runs, with GPT-4o offering the most reliable performance in terms of coverage and consistency, though overall end-to-end success stays around the 40% mark and depends on the LLM. The work demonstrates a scalable, domain-agnostic framework that can extend to other verification tasks such as UVM testbenches and beyond, highlighting a path toward more autonomous, AI-assisted hardware verification environments.

Abstract

Recently, Devin has made a significant buzz in the Artificial Intelligence (AI) community as the world's first fully autonomous AI software engineer, capable of independently developing software code. Devin uses the concept of agentic workflow in Generative AI (GenAI), which empowers AI agents to engage in a more dynamic, iterative, and self-reflective process. In this paper, we present a similar fully autonomous AI formal verification engineer, Saarthi, capable of verifying a given RTL design end-to-end using an agentic workflow. With Saarthi, verification engineers can focus on more complex problems, and verification teams can strive for more ambitious goals. The domain-agnostic implementation of Saarthi makes it scalable for use across various domains such as RTL design, UVM-based verification, and others.

Saarthi: The First AI Formal Verification Engineer

TL;DR

The paper addresses the challenge of achieving fully autonomous end-to-end formal verification for RTL designs by introducing Saarthi, an agentic AI-based verification engine. Saarthi orchestrates multiple AI agents to generate a verification vPlan, create SystemVerilog Assertions (SVA), prove properties, analyze counterexamples, and assess formal coverage, all within a human-guided, hyper-structured workflow that includes a human-in-the-loop (HIL) mechanism to prevent looping. Benchmark results show that the approach yields meaningful end-to-end verification in a substantial fraction of runs, with GPT-4o offering the most reliable performance in terms of coverage and consistency, though overall end-to-end success stays around the 40% mark and depends on the LLM. The work demonstrates a scalable, domain-agnostic framework that can extend to other verification tasks such as UVM testbenches and beyond, highlighting a path toward more autonomous, AI-assisted hardware verification environments.

Abstract

Recently, Devin has made a significant buzz in the Artificial Intelligence (AI) community as the world's first fully autonomous AI software engineer, capable of independently developing software code. Devin uses the concept of agentic workflow in Generative AI (GenAI), which empowers AI agents to engage in a more dynamic, iterative, and self-reflective process. In this paper, we present a similar fully autonomous AI formal verification engineer, Saarthi, capable of verifying a given RTL design end-to-end using an agentic workflow. With Saarthi, verification engineers can focus on more complex problems, and verification teams can strive for more ambitious goals. The domain-agnostic implementation of Saarthi makes it scalable for use across various domains such as RTL design, UVM-based verification, and others.

Paper Structure

This paper contains 13 sections, 13 figures, 6 tables, 1 algorithm.

Figures (13)

  • Figure 1: Non-agentic workflow
  • Figure 2: Agentic workflow
  • Figure 3: Coder and critic AI agents for self-reflection (adapted from youtube)
  • Figure 4: Iterative feedback with self-refinement self_refine
  • Figure 5: Example of self-refinement: an initial output is generated by the base LLM and then passed back to the same LLM to receive feedback and sent to the same LLM to refine the output.
  • ...and 8 more figures