Table of Contents
Fetching ...

Integrating Symbolic Execution with LLMs for Automated Generation of Program Specifications

Fanpeng Yang, Xu Ma, Shuling Wang, Xiong Xu, Qinxiang Cao, Naijun Zhan, Xiaofeng Li, Bin Gu

TL;DR

This work tackles automatic generation of formal specifications (preconditions, postconditions, and loop invariants) for legacy C programs by integrating symbolic execution with large language models (LLMs) and formal verification. It introduces SESpec, a four-stage pipeline that uses a flattened memory model, strongest postconditions for loop-free code, template-guided loop invariants, and verification-driven refinement to produce precise specifications without over-reliance on explicit verification goals. Key contributions include a compositional FuncSpec framework, a template-based LoopInvGen with LoopAnalysis, and a tight execution–verification loop backed by QCP and Frama-C, along with calibration prompts for inductive predicates. Empirical results across diverse benchmarks show SESpec achieving state-of-the-art performance in invariant and specification generation, robust to goal masking, and capable of handling complex data structures like structs, pointers, and inductive types. The work demonstrates strong practical impact for automated specification generation in industrial codebases, providing a scalable pathway toward formal verification of legacy software.

Abstract

Automatically generating formal specifications including loop invariants, preconditions, and postconditions for legacy code is critical for program understanding, reuse and verification. However, the inherent complexity of control and data structures in programs makes this task particularly challenging. This paper presents a novel framework that integrates symbolic execution with large language models (LLMs) to automatically synthesize formally verified program specifications. Our method first employs symbolic execution to derive precise strongest postconditions for loop-free code segments. These symbolic execution results, along with automatically generated invariant templates, then guide the LLM to propose and iteratively refine loop invariants until a correct specification is obtained. The template-guided generation process robustly combines symbolic inference with LLM reasoning, significantly reducing hallucinations and syntactic errors by structurally constraining the LLM's output space. Furthermore, our approach can produce strong specifications without relying on externally provided verification goals, enabled by the rich semantic context supplied by symbolic execution, overcoming a key limitation of prior goal-dependent tools. Extensive evaluation shows that our tool SESpec outperforms the existing state-of-the-art tools across numerical and data-structure benchmarks, demonstrating both high precision and broad applicability.

Integrating Symbolic Execution with LLMs for Automated Generation of Program Specifications

TL;DR

This work tackles automatic generation of formal specifications (preconditions, postconditions, and loop invariants) for legacy C programs by integrating symbolic execution with large language models (LLMs) and formal verification. It introduces SESpec, a four-stage pipeline that uses a flattened memory model, strongest postconditions for loop-free code, template-guided loop invariants, and verification-driven refinement to produce precise specifications without over-reliance on explicit verification goals. Key contributions include a compositional FuncSpec framework, a template-based LoopInvGen with LoopAnalysis, and a tight execution–verification loop backed by QCP and Frama-C, along with calibration prompts for inductive predicates. Empirical results across diverse benchmarks show SESpec achieving state-of-the-art performance in invariant and specification generation, robust to goal masking, and capable of handling complex data structures like structs, pointers, and inductive types. The work demonstrates strong practical impact for automated specification generation in industrial codebases, providing a scalable pathway toward formal verification of legacy software.

Abstract

Automatically generating formal specifications including loop invariants, preconditions, and postconditions for legacy code is critical for program understanding, reuse and verification. However, the inherent complexity of control and data structures in programs makes this task particularly challenging. This paper presents a novel framework that integrates symbolic execution with large language models (LLMs) to automatically synthesize formally verified program specifications. Our method first employs symbolic execution to derive precise strongest postconditions for loop-free code segments. These symbolic execution results, along with automatically generated invariant templates, then guide the LLM to propose and iteratively refine loop invariants until a correct specification is obtained. The template-guided generation process robustly combines symbolic inference with LLM reasoning, significantly reducing hallucinations and syntactic errors by structurally constraining the LLM's output space. Furthermore, our approach can produce strong specifications without relying on externally provided verification goals, enabled by the rich semantic context supplied by symbolic execution, overcoming a key limitation of prior goal-dependent tools. Extensive evaluation shows that our tool SESpec outperforms the existing state-of-the-art tools across numerical and data-structure benchmarks, demonstrating both high precision and broad applicability.

Paper Structure

This paper contains 37 sections, 2 equations, 6 figures, 3 tables, 4 algorithms.

Figures (6)

  • Figure 1: Motivating Example
  • Figure 2: Overview of Our Approach
  • Figure 3: The Implementation Design Framework
  • Figure 4: The Prompt Design
  • Figure 5: Ablation Study. Blue cells: loop invariant generation. Red cells: program specification generation.
  • ...and 1 more figures

Theorems & Definitions (1)

  • Definition 1