Table of Contents
Fetching ...

Neuro-Symbolic Generation and Validation of Memory-Aware Formal Function Specifications

Liao Zhang, Tong Chen, Xiwei Wu, Qi Liu, Xiyu Zhai, Xinqi Wang, Qinxiang Cao

Abstract

Formal verification of memory-manipulating programs critically depends on precise function specifications that capture memory states written by experts. This requirement has become a major bottleneck as large language models (LLMs) increasingly generate low-level systems code whose correctness cannot be assumed. To enable scalable formal verification, we focus exclusively on function specification generation, deliberately avoiding the synthesis of complex loop invariants that are central to traditional verification pipelines. We propose a neuro-symbolic framework for automatically generating memory-aware formal function specifications for C programs from natural language problem descriptions and function signatures. The pipeline first produces candidate specifications via in-context learning, and then iteratively refines them using compiler diagnostics from symbolic provers and the verification toolchain. In particular, we validate candidate specifications by constructing a proof for the negation of the specification with concrete examples, enabling machine-checked rejection of plausible-but-incorrect specifications. To support systematic evaluation, we introduce LeetCode-C-Spec, a new benchmark of 200 C programming problems for generating memory-aware formal function specifications. Experiments show that iterative refinement substantially improves syntactic validity, while symbolic prover-based refutation significantly enhances correctness assessment by filtering false positives that LLM-only judges frequently accept. Our results demonstrate that combining neural generation with symbolic feedback provides an effective approach to formal specification synthesis for memory-safe systems software.

Neuro-Symbolic Generation and Validation of Memory-Aware Formal Function Specifications

Abstract

Formal verification of memory-manipulating programs critically depends on precise function specifications that capture memory states written by experts. This requirement has become a major bottleneck as large language models (LLMs) increasingly generate low-level systems code whose correctness cannot be assumed. To enable scalable formal verification, we focus exclusively on function specification generation, deliberately avoiding the synthesis of complex loop invariants that are central to traditional verification pipelines. We propose a neuro-symbolic framework for automatically generating memory-aware formal function specifications for C programs from natural language problem descriptions and function signatures. The pipeline first produces candidate specifications via in-context learning, and then iteratively refines them using compiler diagnostics from symbolic provers and the verification toolchain. In particular, we validate candidate specifications by constructing a proof for the negation of the specification with concrete examples, enabling machine-checked rejection of plausible-but-incorrect specifications. To support systematic evaluation, we introduce LeetCode-C-Spec, a new benchmark of 200 C programming problems for generating memory-aware formal function specifications. Experiments show that iterative refinement substantially improves syntactic validity, while symbolic prover-based refutation significantly enhances correctness assessment by filtering false positives that LLM-only judges frequently accept. Our results demonstrate that combining neural generation with symbolic feedback provides an effective approach to formal specification synthesis for memory-safe systems software.
Paper Structure (10 sections, 2 equations, 4 figures, 1 table)

This paper contains 10 sections, 2 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: QCP formal verification for singly-linked list reversal with separation logic. The specification includes precondition Require sll(p, l) and postcondition Ensure sll(__return, rev(l)). The loop invariant maintains that the concatenation of reversed prefix and remaining suffix equals the original list. The separation operator ($*$) ensures disjoint heap regions for memory safety.
  • Figure 2: System architecture for LLM-based generation and refinement of memory-aware formal specifications. The pipeline consists of two main phases: (1) Specification Generation using LLM with in-context learning, and (2) Iterative Refinement guided by compiler feedback, symbolic execution, and counterexample-guided refutation to ensure both syntactic correctness.
  • Figure 3: Example format of a Coq refutation test case (tree insertion). The test binds concrete inputs, computes result, and asserts it differs from the expected output.
  • Figure 4: Counterexample-guided refutation workflow. Given a candidate specification, we negate the postcondition and attempt to find a test case satisfying the precondition but violating the postcondition. If such a counterexample exists, symbolic execution in Coq can concretely demonstrate the specification's incorrectness by exhibiting the violating heap state.