Table of Contents
Fetching ...

RocqStar: Leveraging Similarity-driven Retrieval and Agentic Systems for Rocq generation

Andrei Kozyrev, Nikita Khramov, Gleb Solovev, Anton Podkopaev

TL;DR

The paper tackles the challenge of improving Rocq theorem generation by combining similarity-aware premise retrieval with an agentic proving system. It introduces RocqStar, a retrieval mechanism trained to predict proof similarity using data mined via BigRocq, enabling evidence-aware context selection and achieving up to 28% relative improvements over baselines. It also presents an autonomous agentic pipeline with multi-agent debate planning and a reflection loop, integrated through a Model Context Protocol server, achieving up to 60% proof success on the IMM-300 dataset and demonstrating the critical roles of planning quality and reflective replanning. Together, these contributions advance retrieval-augmented generation for formal verification and illustrate the practical viability of agentic systems in interactive theorem proving.

Abstract

Interactive Theorem Proving was repeatedly shown to be fruitful combined with Generative Artificial Intelligence. This paper assesses multiple approaches to Rocq generation and illuminates potential avenues for improvement. We highlight the importance of thorough premise selection for generating Rocq proofs and propose a novel approach, leveraging retrieval via a self-attentive embedder model. The evaluation of the designed approach shows up to 28% relative increase of the generator's performance. We tackle the problem of writing Rocq proofs using a multi-stage agentic system, tailored for formal verification, and demonstrate its high effectiveness. We conduct an ablation study and demonstrate shows that incorporating multi-agent debate during the planning stage increases the proof success rate by 20% overall and nearly doubles it for complex theorems, while the reflection mechanism further enhances stability and consistency.

RocqStar: Leveraging Similarity-driven Retrieval and Agentic Systems for Rocq generation

TL;DR

The paper tackles the challenge of improving Rocq theorem generation by combining similarity-aware premise retrieval with an agentic proving system. It introduces RocqStar, a retrieval mechanism trained to predict proof similarity using data mined via BigRocq, enabling evidence-aware context selection and achieving up to 28% relative improvements over baselines. It also presents an autonomous agentic pipeline with multi-agent debate planning and a reflection loop, integrated through a Model Context Protocol server, achieving up to 60% proof success on the IMM-300 dataset and demonstrating the critical roles of planning quality and reflective replanning. Together, these contributions advance retrieval-augmented generation for formal verification and illustrate the practical viability of agentic systems in interactive theorem proving.

Abstract

Interactive Theorem Proving was repeatedly shown to be fruitful combined with Generative Artificial Intelligence. This paper assesses multiple approaches to Rocq generation and illuminates potential avenues for improvement. We highlight the importance of thorough premise selection for generating Rocq proofs and propose a novel approach, leveraging retrieval via a self-attentive embedder model. The evaluation of the designed approach shows up to 28% relative increase of the generator's performance. We tackle the problem of writing Rocq proofs using a multi-stage agentic system, tailored for formal verification, and demonstrate its high effectiveness. We conduct an ablation study and demonstrate shows that incorporating multi-agent debate during the planning stage increases the proof success rate by 20% overall and nearly doubles it for complex theorems, while the reflection mechanism further enhances stability and consistency.

Paper Structure

This paper contains 20 sections, 9 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Processing theorems into trees. $s_i$ denotes a state
  • Figure 2: Agentic pipeline with RocqStar retriever.
  • Figure 3: Theorems with dissimilar statements and similar proofs