Table of Contents
Fetching ...

Formally Verified Approximate Policy Iteration

Maximilian Schäffeler, Mohammad Abdulaziz

TL;DR

The paper tackles the challenge of formally verifying an approximate policy iteration algorithm for factored MDPs using Isabelle/HOL, and then derives a verified executable implementation that includes a certified LP solver. It introduces a refinement workflow based on locales to connect abstract specifications to concrete code, and demonstrates a Scala-exportable pipeline with LP certificate checking for practical deployment. Empirical evaluation on a ring-network domain shows the approach scales to hundreds of thousands of states and confirms convergence within a handful of iterations, while also revealing solver limitations. The work advances rigorous verification for planning under uncertainty, offering reusable formal foundations and a concrete pathway to trustworthy, verifiable AI solvers for large, structured MDPs.

Abstract

We formally verify an algorithm for approximate policy iteration on Factored Markov Decision Processes using the interactive theorem prover Isabelle/HOL. Next, we show how the formalized algorithm can be refined to an executable, verified implementation. The implementation is evaluated on benchmark problems to show its practicability. As part of the refinement, we develop verified software to certify Linear Programming solutions. The algorithm builds on a diverse library of formalized mathematics and pushes existing methodologies for interactive theorem provers to the limits. We discuss the process of the verification project and the modifications to the algorithm needed for formal verification.

Formally Verified Approximate Policy Iteration

TL;DR

The paper tackles the challenge of formally verifying an approximate policy iteration algorithm for factored MDPs using Isabelle/HOL, and then derives a verified executable implementation that includes a certified LP solver. It introduces a refinement workflow based on locales to connect abstract specifications to concrete code, and demonstrates a Scala-exportable pipeline with LP certificate checking for practical deployment. Empirical evaluation on a ring-network domain shows the approach scales to hundreds of thousands of states and confirms convergence within a handful of iterations, while also revealing solver limitations. The work advances rigorous verification for planning under uncertainty, offering reusable formal foundations and a concrete pathway to trustworthy, verifiable AI solvers for large, structured MDPs.

Abstract

We formally verify an algorithm for approximate policy iteration on Factored Markov Decision Processes using the interactive theorem prover Isabelle/HOL. Next, we show how the formalized algorithm can be refined to an executable, verified implementation. The implementation is evaluated on benchmark problems to show its practicability. As part of the refinement, we develop verified software to certify Linear Programming solutions. The algorithm builds on a diverse library of formalized mathematics and pushes existing methodologies for interactive theorem provers to the limits. We discuss the process of the verification project and the modifications to the algorithm needed for formal verification.
Paper Structure (27 sections, 1 theorem, 11 equations, 2 figures, 2 tables)

This paper contains 27 sections, 1 theorem, 11 equations, 2 figures, 2 tables.

Key Result

Theorem 1

Let $api(w_0, \mathit{\pi}_0) = (t', \mathit{\pi}, w, \mathit{err}, \mathit{True})$. Then $(1 - \mathit{\gamma}) \| \mathit{\nu}^* - \mathit{\nu}_{w} \| \le 2\mathit{\gamma}\cdot\mathit{err}$.

Figures (2)

  • Figure 1: (a) Ring network of 4 connected computers. (b) Variable dependencies of the default action. (c) Probabilities of machine $C_1$ working in the next step, for every state of the machines $C_1$ and $C_4$.
  • Figure 2: Simplified Locale Hierarchy

Theorems & Definitions (1)

  • Theorem 1