The CAISAR Platform: Extending the Reach of Machine Learning Specification and Verification
Michele Alberti, François Bobot, Julien Girard-Satabin, Alban Grastien, Aymeric Varasse, Zakaria Chihani
TL;DR
The paper introduces CAISAR, an open-source platform for the formal specification and verification of machine learning models that goes beyond local robustness. It is built on a Why3-based pipeline and a Neural Intermediate Representation (NIR) that can express properties spanning multiple models. The paper presents a high-level CAISAR specification language and an automated graph-editing approach that translates these specifications into inputs for off-the-shelf provers, including VNN-LIB-compatible backends, which enables cross-verification among diverse tools. The authors illustrate expressiveness and practical utility with use cases such as ACAS Xu with unnormalized inputs and compositions of multiple networks, and they discuss limitations and tool-dependent performance. Overall, CAISAR aims to bridge the gap between high-level ML specifications and the input formats of existing provers, so that richer semantic properties can be verified, and it ships with a reproducible artifact.
Abstract
The formal specification and verification of machine learning programs has seen remarkable progress in less than a decade, leading to a profusion of tools. However, diversity may lead to fragmentation, resulting in tools that are difficult to compare except on very specific benchmarks. Furthermore, this progress is heavily geared towards the specification and verification of a certain class of properties, namely local robustness properties. While provers are becoming more and more efficient at solving local robustness properties, even slightly more complex properties, for example those involving multiple neural networks, cannot be expressed in the input languages of the winners of the International Verification of Neural Networks Competition (VNN-COMP). In this tool paper, we present CAISAR, an open-source platform dedicated to machine learning specification and verification. We present its specification language, suitable for modelling complex properties on neural networks, support vector machines and boosted trees. We show on concrete use cases how specifications written in this language are automatically translated to queries to state-of-the-art provers, notably by using automated graph editing techniques, making it possible to use their off-the-shelf versions. The artifact for reproducing the paper's claims is available at the following DOI: https://doi.org/10.5281/zenodo.15209510
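
To make the contrast drawn above concrete, the following sketch states local robustness of a classifier in its usual form, followed by one illustrative property over two networks (output agreement on an input domain). The symbols f, g, x_0, \varepsilon, \delta and D are notation assumed here for illustration; they are not taken from the paper, and the second formula is only one example of a multi-network property.

```latex
% Local robustness of a single network f around a fixed input x_0:
% every input within an L-infinity ball of radius epsilon gets the same class as x_0.
\[
\forall x.\; \lVert x - x_0 \rVert_\infty \le \varepsilon
\;\Longrightarrow\;
\arg\max_i f(x)_i = \arg\max_i f(x_0)_i
\]

% Illustrative two-network property (assumed example):
% the outputs of f and g differ by at most delta on every input in a domain D.
\[
\forall x \in D.\; \lVert f(x) - g(x) \rVert_\infty \le \delta
\]
```

The first formula constrains one network on a neighbourhood of one point. The second relates the outputs of two networks on the same input, which a query format built around a single network cannot state directly.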
