Table of Contents
Fetching ...

Program Structure Aware Precondition Generation

Elizabeth Dinella, Shuvendu Lahiri, Mayur Naik

TL;DR

The paper tackles automatic inference of natural preconditions for code by transforming the target method rather than constructing predicates from scratch. It introduces a seed-based transformation pipeline that converts a method into a boolean-returning, non-exceptional precondition, iteratively refined through active test generation to guard against crashes and then reduced to remove irrelevant computation. Empirical results show preconditions inferred by this approach are comparably correct to state-of-the-art methods yet are significantly easier for humans to reason about and more effective as few-shot prompts for GPT-4; a large-scale dataset of approximately $\sim 18{,}000$ method–precondition pairs across $87$ projects demonstrates the framework’s scalability. The work contributes a new representation for preconditions, a practical toolchain, and a valuable real-world dataset that can drive future research in precondition inference and program verification. Overall, the approach advances the usability and applicability of precondition inference in software engineering tools and AI-assisted development.

Abstract

We introduce a novel approach for inferring natural preconditions from code. Our technique produces preconditions of high quality in terms of both correctness (modulo a test generator) and naturalness. Prior works generate preconditions from scratch through combinations of boolean predicates, but fall short in readability and ease of comprehension. Our innovation lies in, instead, leveraging the structure of a target method as a seed to infer a precondition through program transformations. Our evaluation shows that humans can more easily reason over preconditions inferred using our approach. Lastly, we instantiate our technique into a framework which can be applied at scale. We present a dataset of ~18k Java (method, precondition) pairs obtained by applying our framework to 87 real-world projects. We use this dataset to both evaluate our approach and draw useful insights for future research in precondition inference.

Program Structure Aware Precondition Generation

TL;DR

The paper tackles automatic inference of natural preconditions for code by transforming the target method rather than constructing predicates from scratch. It introduces a seed-based transformation pipeline that converts a method into a boolean-returning, non-exceptional precondition, iteratively refined through active test generation to guard against crashes and then reduced to remove irrelevant computation. Empirical results show preconditions inferred by this approach are comparably correct to state-of-the-art methods yet are significantly easier for humans to reason about and more effective as few-shot prompts for GPT-4; a large-scale dataset of approximately method–precondition pairs across projects demonstrates the framework’s scalability. The work contributes a new representation for preconditions, a practical toolchain, and a valuable real-world dataset that can drive future research in precondition inference and program verification. Overall, the approach advances the usability and applicability of precondition inference in software engineering tools and AI-assisted development.

Abstract

We introduce a novel approach for inferring natural preconditions from code. Our technique produces preconditions of high quality in terms of both correctness (modulo a test generator) and naturalness. Prior works generate preconditions from scratch through combinations of boolean predicates, but fall short in readability and ease of comprehension. Our innovation lies in, instead, leveraging the structure of a target method as a seed to infer a precondition through program transformations. Our evaluation shows that humans can more easily reason over preconditions inferred using our approach. Lastly, we instantiate our technique into a framework which can be applied at scale. We present a dataset of ~18k Java (method, precondition) pairs obtained by applying our framework to 87 real-world projects. We use this dataset to both evaluate our approach and draw useful insights for future research in precondition inference.
Paper Structure (36 sections, 10 figures, 4 tables)

This paper contains 36 sections, 10 figures, 4 tables.

Figures (10)

  • Figure 1: DivideAndRemainder method and its corresponding preconditions.
  • Figure 2: Overall architecture of our approach.
  • Figure 3: Illustration of seed generation.
  • Figure 4: Illustration of loop normalization in seed generation for precise check insertion. After applying the normalization (right), a nullness check at P1 would correctly guard against crashes.
  • Figure 5: Illustration of call normalization in seed generation for precise guarding against interprocedural crashes.
  • ...and 5 more figures