Formal that "Floats" High: Formal Verification of Floating Point Arithmetic
Hansa Mohanty, Vaisakh Naduvodi Viswambharan, Deepak Narayan Gadde
TL;DR
The paper addresses the difficulty of verifying floating-point arithmetic by proposing a direct RTL-to-RTL verification framework that compares RTL implementations against a cycle-accurate golden reference. It employs a hierarchical, property-driven workflow with CEX-guided refinement, fault injection, and agentic AI-based property generation guided by HITL to scale verification of datapath-intensive FP units. The FP adder is decomposed into Mantissa Alignment and Add-Round stages, with formal proofs of equivalence supported by two key lemmas; faults are systematically injected to test robustness. Across evaluations, RTL-to-RTL verification with AI-generated properties and HITL refinement achieves high coverage more efficiently than standalone methods, demonstrating a practical path toward scalable AI-assisted formal verification for floating-point datapaths.
Abstract
Formal verification of floating-point arithmetic remains challenging due to non-linear arithmetic behavior and the tight coupling between control and datapath logic. Existing approaches often rely on high-level C models for equivalence checking against Register Transfer Level (RTL) designs, but this introduces abstraction gaps, translation overhead, and limits scalability at the RTL level. To address these challenges, this paper presents a scalable methodology for verifying floating-point arithmetic using direct RTL-to-RTL model checking against a golden reference model. The approach adopts a divide-and conquer strategy that decomposes verification into modular stages, each captured by helper assertions and lemmas that collectively prove a main correctness theorem. Counterexample (CEX)-guided refinement is used to iteratively localize and resolve implementation defects, while targeted fault injection validates the robustness of the verification process against precision-critical datapath errors. To assess scalability and practicality, the methodology is extended with agentic AI-based formal property generation, integrating large language model (LLM)-driven automation with Human-in-the-Loop (HITL) refinement. Coverage analysis evaluates the effectiveness of the approach by comparing handwritten and AI-generated properties in both RTL-to-RTL model checking and standalone RTL verification settings. Results show that direct RTL-to-RTL model checking achieves higher coverage efficiency and requires fewer assertions than standalone verification, especially when combined with AI-generated properties refined through HITL guidance.
