Enhancing LLM-based Specification Generation via Program Slicing and Logical Deletion
Zehan Chen, Long Zhang, Zhiwei Zhang, JingJing Zhang, Ruoyu Zhou, Yulong Shen, JianFeng Ma, Lin Yang
TL;DR
SLD-Spec tackles the generality and precision gaps in LLM-based formal specification generation by introducing program slicing and a novel logical deletion stage. The four-phase pipeline—slice, guess, logical delete, verify—reduces contextual interference, preserves logically correct specifications, and leverages Frama-C for final verification. Empirical results on frama-c-problems and a newly constructed complex-func dataset show higher verification rates and better efficiency than AutoSpec and SpecGen, with ablations confirming the value of both core components. The work advances practical specification synthesis for programs with complex control flow and provides openly available data and tools to support further research.
Abstract
Traditional formal specification generation methods are typically tailored to specific specification types, and therefore suffer from limited generality. In recent years, large language model (LLM)-based specification generation approaches have emerged, offering a new direction for improving the universality of automated specification synthesis. However, when dealing with complex control flow, LLMs often struggle to precisely generate complete specifications that cover substructures. Moreover, the distinctive verification pipelines adopted by existing approaches may incorrectly discard logically correct specifications, while verification tools alone cannot reliably identify correct specifications. To address these issues, we propose SLD-Spec, an LLM-based specification generation method that combines program slicing and logical deletion. Specifically, SLD-Spec augments the conventional specification generation framework with two key stages: (1) a program slicing stage that decomposes the target function into several smaller code slices, enabling LLMs to focus on more localized semantic structures and thereby improving specification relevance and completeness; and (2) a logical deletion stage that leverages LLMs to perform logical reasoning and filtering over candidate specifications so as to retain logically correct ones. Experimental results show that SLD-Spec consistently outperforms existing methods on datasets containing programs of varying complexity, verifying more programs and generating specifications that are more relevant and more complete. Further ablation studies indicate that program slicing mainly improves specification relevance and completeness, whereas logical deletion plays a key role in increasing verification success rates.
