Towards LLM-based Generation of Human-Readable Proofs in Polynomial Formal Verification
Rolf Drechsler
TL;DR
This work addresses the automation gap in Polynomial Formal Verification (PFV) by exploring the use of Large Language Models (LLMs) to generate human-readable proofs that can be validated by formal reasoning engines. It leverages induction-based proof ideas and BDD-based reasoning, illustrating scenarios such as $f_2$, extensions to three variables, totally symmetric functions with $O(n^2)$ BDD size, and the multiplication case where BDD size is exponential regardless of ordering. The paper discusses coupling LLM outputs with formal tools, and outlines future directions including Retrieval-Augmented Generation (RAG) and pattern-based proof strategies to improve correctness and integration with PFV engines. Overall, it presents a practical pathway for producing scalable, understandable PFV proofs and provides reproducible resources (Appendix prompts) to support deployment and further research.
Abstract
Verification is one of the central tasks in circuit and system design. While simulation and emulation are widely used, complete correctness can only be ensured based on formal proof techniques. But these approaches often have very high run time and memory requirements. Recently, Polynomial Formal Verification (PFV) has been introduced showing that for many instances of practical relevance upper bounds on needed resources can be given. But proofs have to be provided that are human-readable. Here, we study how modern approaches from Artificial Intelligence (AI) based on Large Language Models (LLMs) can be used to generate proofs that later on can be validated based on reasoning engines. Examples are given that show how LLMs can interact with proof engines, and directions for future work are outlined.
