Formally Verified Approximate Policy Iteration
Maximilian Schäffeler, Mohammad Abdulaziz
TL;DR
The paper tackles the challenge of formally verifying an approximate policy iteration algorithm for factored MDPs using Isabelle/HOL, and then derives a verified executable implementation that includes a certified LP solver. It introduces a refinement workflow based on locales to connect abstract specifications to concrete code, and demonstrates a Scala-exportable pipeline with LP certificate checking for practical deployment. Empirical evaluation on a ring-network domain shows the approach scales to hundreds of thousands of states and confirms convergence within a handful of iterations, while also revealing solver limitations. The work advances rigorous verification for planning under uncertainty, offering reusable formal foundations and a concrete pathway to trustworthy, verifiable AI solvers for large, structured MDPs.
Abstract
We formally verify an algorithm for approximate policy iteration on Factored Markov Decision Processes using the interactive theorem prover Isabelle/HOL. Next, we show how the formalized algorithm can be refined to an executable, verified implementation. The implementation is evaluated on benchmark problems to show its practicability. As part of the refinement, we develop verified software to certify Linear Programming solutions. The algorithm builds on a diverse library of formalized mathematics and pushes existing methodologies for interactive theorem provers to the limits. We discuss the process of the verification project and the modifications to the algorithm needed for formal verification.
