Towards a Certified Proof Checker for Deep Neural Network Verification
Remi Desmartin, Omri Isac, Grant Passmore, Kathrin Stark, Guy Katz, Ekaterina Komendantskaya
TL;DR
This work addresses the need for reliable verification of DNN verifiers in safety-critical settings by introducing a formally verifiable proof checker implemented in Imandra, which uses infinite-precision arithmetic to avoid numerical instability. It adapts Marabou's UNSAT-proof checker and supports multiple checking modes to balance reliability and performance, with ongoing efforts to complete formal verification of the checker and optimize its execution. The authors demonstrate the feasibility of extracting verified OCaml code from Imandra, acknowledge a substantial but acceptable runtime overhead (roughly 150x slower than the C++ baseline) while providing a path toward system-level integration of DNN verification, and outline future work on full verification and optimization. Overall, the work lays groundwork for robust, formally verifiable integration of DNN verifiers into higher-assurance verification frameworks.
Abstract
Recent developments in deep neural networks (DNNs) have led to their adoption in safety-critical systems, which in turn has heightened the need for guaranteeing their safety. These safety properties of DNNs can be proven using tools developed by the verification community. However, these tools are themselves prone to implementation bugs and numerical stability problems, which make their reliability questionable. To overcome this, some verifiers produce proofs of their results which can be checked by a trusted checker. In this work, we present a novel implementation of a proof checker for DNN verification. It improves on existing implementations by offering numerical stability and greater verifiability. To achieve this, we leverage two key capabilities of Imandra, an industrial theorem prover: its support of infinite precision real arithmetic and its formal verification infrastructure. So far, we have implemented a proof checker in Imandra, specified its correctness properties and started to verify the checker's compliance with them. Our ongoing work focuses on completing the formal verification of the checker and further optimizing its performance.
