Laurel: Unblocking Automated Verification with Large Language Models
Eric Mugnier, Emmanuel Anaya Gonzalez, Ranjit Jhala, Nadia Polikarpova, Yuanyuan Zhou
TL;DR
Laurel tackles the bottleneck of verification in large Dafny projects by automatically generating missing assertion hints using large language models. The method couples error-message–guided localization of the insertion point with in-context examples drawn from structurally similar proofs, selected by a novel hierarchical sequence similarity metric. On the DafnyGym dataset from three real-world codebases, Laurel achieves 56.6% success in generating necessary assertions within ten attempts, improving verification efficiency by up to a factor of 6–to–1 over naive prompts and leveraging context-specific demonstrations for additional gains. The work provides a practical, model-agnostic approach to unblocking automated verification and offers data and artifacts to extend evaluation to other SMT-based verifiers.
Abstract
Program verifiers such as Dafny automate proofs by outsourcing them to an SMT solver. This automation is not perfect, however, and the solver often requires hints in the form of assertions, creating a burden for the proof engineer. In this paper, we propose Laurel, a tool that alleviates this burden by automatically generating assertions using large language models (LLMs). To improve the success rate of LLMs in this task, we design two domain-specific prompting techniques. First, we help the LLM determine the location of the missing assertion by analyzing the verifier's error message and inserting an assertion placeholder at that location. Second, we provide the LLM with example assertions from the same codebase, which we select based on a new proof similarity metric. We evaluate our techniques on our new benchmark DafnyGym, a dataset of complex lemmas we extracted from three real-world Dafny codebases. Our evaluation shows that Laurel is able to generate over 56.6\% of the required assertions given only a few attempts, making LLMs an affordable tool for unblocking program verifiers without human intervention.
