Program Structure Aware Precondition Generation
Elizabeth Dinella, Shuvendu Lahiri, Mayur Naik
TL;DR
The paper tackles automatic inference of natural preconditions for code by transforming the target method rather than constructing predicates from scratch. It introduces a seed-based transformation pipeline that converts a method into a boolean-returning, non-exceptional precondition, iteratively refined through active test generation to guard against crashes and then reduced to remove irrelevant computation. Empirical results show preconditions inferred by this approach are comparably correct to state-of-the-art methods yet are significantly easier for humans to reason about and more effective as few-shot prompts for GPT-4; a large-scale dataset of approximately $\sim 18{,}000$ method–precondition pairs across $87$ projects demonstrates the framework’s scalability. The work contributes a new representation for preconditions, a practical toolchain, and a valuable real-world dataset that can drive future research in precondition inference and program verification. Overall, the approach advances the usability and applicability of precondition inference in software engineering tools and AI-assisted development.
Abstract
We introduce a novel approach for inferring natural preconditions from code. Our technique produces preconditions of high quality in terms of both correctness (modulo a test generator) and naturalness. Prior works generate preconditions from scratch through combinations of boolean predicates, but fall short in readability and ease of comprehension. Our innovation lies in, instead, leveraging the structure of a target method as a seed to infer a precondition through program transformations. Our evaluation shows that humans can more easily reason over preconditions inferred using our approach. Lastly, we instantiate our technique into a framework which can be applied at scale. We present a dataset of ~18k Java (method, precondition) pairs obtained by applying our framework to 87 real-world projects. We use this dataset to both evaluate our approach and draw useful insights for future research in precondition inference.
