Towards Automated Verification of LLM-Synthesized C Programs
Prasita Mukherjee, Benjamin Delaware
TL;DR
SynVer tackles the challenge of producing high-assurance C programs by marrying LLM-driven synthesis with formal verification in the VST/Rocq framework. It imposes syntactic and semantic biases during generation to produce code that is amenable to automated verification, then uses a two-phase process where GenProof and SepAuto, augmented by an LLM prover, attempt to discharge proof obligations against rich separation-logic specifications. The evaluation demonstrates that a substantial portion of benchmarks can be automatically verified, with insights into prompt design and proof automation strategies; remaining cases reveal both the strengths and current limitations of LLM-assisted verification in this setting. The work significantly advances automated, machine-checked synthesis for heap-manipulating C programs and lays groundwork for broader domain coverage and more robust proof automation. Practically, SynVer offers a pathway to scalable, high-assurance C program generation by leveraging existing interactive verification infrastructure augmented with modern LLMs.
Abstract
We present \synver{}, a novel synthesis and verification framework for C programs, that deploys a Large Language Model (LLM) to search for a candidate program that satisfies the given specification. Our key idea is to impose syntactic and semantic biases on programs generated by LLMs, such that the synthesized program is more amenable to automated verification. Based on this idea, we propose a novel specification-verification tool, built on top of Verified Software Toolchain, that help automate the process. Our experiments on a diverse set of benchmarks drawn from the deductive program synthesis community, shows that this approach is scalable and extensible. The benchmarks constitute of specifications comprising of basic coding examples, Separation Logic based assertions, and API specifications.
