Are LLMs Ready for Practical Adoption for Assertion Generation?

Vaishnavi Pulavarthi; Deeksha Nandal; Soham Dan; Debjit Pal

Are LLMs Ready for Practical Adoption for Assertion Generation?

Vaishnavi Pulavarthi, Deeksha Nandal, Soham Dan, Debjit Pal

TL;DR

The paper tackles the readiness of LLMs for practical hardware assertion generation by introducing AssertionBench, a 100-design Verilog benchmark, and proposing AssertionLLM, a fine-tuned LLM for improved assertion correctness. It formalizes assertions as $P = \mathcal{G}(A \\rightarrow C)$ with $A = \\bigwedge_{i=0}^m \\mathcal{X}^i(A^i)$ and $C = \\mathcal{X}^n(C^n)$, $n \\ge m$, and demonstrates that COTS LLMs produce many syntactic/semantic errors. Finetuning (from CodeLLaMa 2 and LLaMa3-70B) significantly boosts correctness (e.g., CodeLLaMa 2 shows up to +29% and +38% proven assertions and substantial CEX reductions), though syntactic errors persist, underscoring the need for richer HDL understanding and auxiliary design representations. The work provides a clear path toward practical AI-assisted assertion generation and outlines key directions for extending coverage, exploiting design artifacts, and addressing security considerations to accelerate industrial adoption.

Abstract

Assertions have been the de facto collateral for simulation-based and formal verification of hardware designs for over a decade. The quality of hardware verification, i.e., detection and diagnosis of corner-case design bugs, is critically dependent on the quality of the assertions. With the onset of generative AI such as Transformers and Large-Language Models (LLMs), there has been a renewed interest in developing novel, effective, and scalable techniques of generating functional and security assertions from design source code. While there have been recent works that use commercial-of-the-shelf (COTS) LLMs for assertion generation, there is no comprehensive study in quantifying the effectiveness of LLMs in generating syntactically and semantically correct assertions. In this paper, we first discuss AssertionBench from our prior work, a comprehensive set of designs and assertions to quantify the goodness of a broad spectrum of COTS LLMs for the task of assertion generations from hardware design source code. Our key insight was that COTS LLMs are not yet ready for prime-time adoption for assertion generation as they generate a considerable fraction of syntactically and semantically incorrect assertions. Motivated by the insight, we propose AssertionLLM, a first of its kind LLM model, specifically fine-tuned for assertion generation. Our initial experimental results show that AssertionLLM considerably improves the semantic and syntactic correctness of the generated assertions over COTS LLMs.

Are LLMs Ready for Practical Adoption for Assertion Generation?

TL;DR

Abstract

Are LLMs Ready for Practical Adoption for Assertion Generation?

TL;DR

Abstract

Paper Structure

Table of Contents

Figures (9)