Table of Contents
Fetching ...

Towards Automated Verification of LLM-Synthesized C Programs

Prasita Mukherjee, Benjamin Delaware

TL;DR

SynVer tackles the challenge of producing high-assurance C programs by marrying LLM-driven synthesis with formal verification in the VST/Rocq framework. It imposes syntactic and semantic biases during generation to produce code that is amenable to automated verification, then uses a two-phase process where GenProof and SepAuto, augmented by an LLM prover, attempt to discharge proof obligations against rich separation-logic specifications. The evaluation demonstrates that a substantial portion of benchmarks can be automatically verified, with insights into prompt design and proof automation strategies; remaining cases reveal both the strengths and current limitations of LLM-assisted verification in this setting. The work significantly advances automated, machine-checked synthesis for heap-manipulating C programs and lays groundwork for broader domain coverage and more robust proof automation. Practically, SynVer offers a pathway to scalable, high-assurance C program generation by leveraging existing interactive verification infrastructure augmented with modern LLMs.

Abstract

We present \synver{}, a novel synthesis and verification framework for C programs, that deploys a Large Language Model (LLM) to search for a candidate program that satisfies the given specification. Our key idea is to impose syntactic and semantic biases on programs generated by LLMs, such that the synthesized program is more amenable to automated verification. Based on this idea, we propose a novel specification-verification tool, built on top of Verified Software Toolchain, that help automate the process. Our experiments on a diverse set of benchmarks drawn from the deductive program synthesis community, shows that this approach is scalable and extensible. The benchmarks constitute of specifications comprising of basic coding examples, Separation Logic based assertions, and API specifications.

Towards Automated Verification of LLM-Synthesized C Programs

TL;DR

SynVer tackles the challenge of producing high-assurance C programs by marrying LLM-driven synthesis with formal verification in the VST/Rocq framework. It imposes syntactic and semantic biases during generation to produce code that is amenable to automated verification, then uses a two-phase process where GenProof and SepAuto, augmented by an LLM prover, attempt to discharge proof obligations against rich separation-logic specifications. The evaluation demonstrates that a substantial portion of benchmarks can be automatically verified, with insights into prompt design and proof automation strategies; remaining cases reveal both the strengths and current limitations of LLM-assisted verification in this setting. The work significantly advances automated, machine-checked synthesis for heap-manipulating C programs and lays groundwork for broader domain coverage and more robust proof automation. Practically, SynVer offers a pathway to scalable, high-assurance C program generation by leveraging existing interactive verification infrastructure augmented with modern LLMs.

Abstract

We present \synver{}, a novel synthesis and verification framework for C programs, that deploys a Large Language Model (LLM) to search for a candidate program that satisfies the given specification. Our key idea is to impose syntactic and semantic biases on programs generated by LLMs, such that the synthesized program is more amenable to automated verification. Based on this idea, we propose a novel specification-verification tool, built on top of Verified Software Toolchain, that help automate the process. Our experiments on a diverse set of benchmarks drawn from the deductive program synthesis community, shows that this approach is scalable and extensible. The benchmarks constitute of specifications comprising of basic coding examples, Separation Logic based assertions, and API specifications.

Paper Structure

This paper contains 26 sections, 2 equations, 9 figures, 5 tables, 2 algorithms.

Figures (9)

  • Figure 1: C function that concatenates two singly-linked lists
  • Figure 2: A partial proof of correctness of in VST
  • Figure 3: Overview of our approach
  • Figure 4: The initial prompt template for the coder LLM
  • Figure 5: Templates used to re-prompt the coder LLM
  • ...and 4 more figures