FormalSpecCpp: A Dataset of C++ Formal Specifications created using LLMs
Madhurima Chakraborty, Peter Pirkelbauer, Qing Yi
TL;DR
FormalSpecCpp addresses the lack of standardized benchmarks for C++ formal specifications by creating a ground-truth dataset of C++ programs with preconditions and postconditions, derived from verified Dafny code. The authors employ a prompt-driven translation pipeline using GPT-4-turbo to convert Dafny specifications into C++, complemented by automated and manual verification of translated tests and code. The dataset enables rigorous evaluation of specification inference tools and supports AI-assisted software verification and testing, with demonstrated efficiency and transparent cost metrics. This work advances formal verification research in C++ and provides a practical resource for benchmarking and improving automated specification generation and validation.
Abstract
FormalSpecCpp is a dataset designed to fill the gap in standardized benchmarks for verifying formal specifications in C++ programs. To the best of our knowledge, this is the first comprehensive collection of C++ programs with well-defined preconditions and postconditions. It provides a structured benchmark for evaluating specification inference tools and testing theaccuracy of generated specifications. Researchers and developers can use this dataset to benchmark specification inference tools,fine-tune Large Language Models (LLMs) for automated specification generation, and analyze the role of formal specifications in improving program verification and automated testing. By making this dataset publicly available, we aim to advance research in program verification, specification inference, and AI-assisted software development. The dataset and the code are available at https://github.com/MadhuNimmo/FormalSpecCpp.
