Faster Verified Explanations for Neural Networks
Alessandro De Palma, Greta Dolcetti, Caterina Urban
TL;DR
The paper tackles the scalability of verified explanations for neural networks with FaVeX, a batch-sequential algorithm that speeds up the underlying robustness queries, and with verifier-optimal robust explanations, a definition that accounts for verifier incompleteness. By partitioning input features into invariants, counterfactuals, and unknowns, it yields a practical, hierarchical explanation that remains well defined when verification is incomplete. FaVeX combines batch processing, incremental reuse of branch-and-bound information, and counterfactual search in a restricted space to scale explanations to networks with hundreds of thousands of activations, while also improving counterfactual discovery on larger CNNs. The work reports substantial speedups over prior methods and shows that verifier-optimal explanations offer meaningful, scalable insights for safety-critical models. Overall, it moves formal explainability closer to real-world applicability in vision tasks and beyond.
Abstract
Verified explanations are a theoretically principled way to explain the decisions taken by neural networks, which are otherwise black-box in nature. However, these techniques face significant scalability challenges, as they require multiple calls to neural network verifiers, each with exponential worst-case complexity. We present FaVeX, a novel algorithm to compute verified explanations. FaVeX accelerates the computation by dynamically combining batch and sequential processing of input features, and by reusing information from previous queries, both when proving invariances with respect to certain input features and when searching for feature assignments altering the prediction. Furthermore, we present a novel, hierarchical definition of verified explanations, termed verifier-optimal robust explanations, that explicitly factors the incompleteness of network verifiers into the explanation. Our comprehensive experimental evaluation demonstrates the superior scalability of both FaVeX and verifier-optimal robust explanations, which together can produce meaningful formal explanations on networks with hundreds of thousands of non-linear activations.
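To make the batch-and-sequential idea and the three-way feature partition concrete, the following is a minimal sketch under stated assumptions, not the FaVeX algorithm itself. The function `explain`, the toy linear model, and the constants `WEIGHTS`, `BIAS`, `X`, `EPS` and `SLACK` are all hypothetical. The verifier is made artificially loose through `SLACK` to stand in for the incompleteness of a real network verifier, the fallback from a failed batch query to one-feature-at-a-time queries is only one plausible reading of "dynamically combining batch and sequential processing", and the reuse of information across queries is not modelled.

```python
from typing import Callable, List, Set, Tuple

def explain(
    n_features: int,
    verify: Callable[[Set[int]], bool],
    attack: Callable[[Set[int]], bool],
    batch_size: int = 2,
) -> Tuple[List[int], List[int], List[int]]:
    """Partition feature indices into (invariants, counterfactuals, unknowns).

    verify(free) must be sound: True only if the prediction provably cannot
    change while the features in `free` vary. attack(free) returns True if it
    finds a concrete prediction-changing assignment of the `free` features.
    """
    free: Set[int] = set()            # features proven irrelevant so far
    counterfactuals: List[int] = []
    unknowns: List[int] = []
    pending = list(range(n_features))
    while pending:
        batch, pending = pending[:batch_size], pending[batch_size:]
        # Batch step: try to free the whole batch with a single query.
        if len(batch) > 1 and verify(free | set(batch)):
            free |= set(batch)
            continue
        # Sequential fallback: examine the batch one feature at a time.
        for i in batch:
            candidate = free | {i}
            if verify(candidate):
                free.add(i)
            elif attack(candidate):
                counterfactuals.append(i)
            else:
                unknowns.append(i)    # verifier inconclusive, no attack found
    return sorted(free), counterfactuals, unknowns

# Hypothetical toy model: the prediction is kept while margin(x) > 0.
WEIGHTS = [1, -1, 10, 4, 1, 7]
BIAS = 10
X = [0, 0, 0, 0, 0, 0]
EPS = 1      # each free feature may move by at most EPS
SLACK = 3    # artificial looseness standing in for verifier incompleteness

def margin(x: List[int]) -> int:
    return sum(w * v for w, v in zip(WEIGHTS, x)) + BIAS

def toy_verify(free: Set[int]) -> bool:
    # Exact worst-case margin for a linear model, then loosened by SLACK.
    exact_lower = margin(X) - EPS * sum(abs(WEIGHTS[i]) for i in free)
    return exact_lower - SLACK > 0

def toy_attack(free: Set[int]) -> bool:
    # Move each free feature to the corner that lowers the margin most.
    corner = list(X)
    for i in free:
        corner[i] -= (EPS if WEIGHTS[i] > 0 else -EPS)
    return margin(corner) < 0

result = explain(len(WEIGHTS), toy_verify, toy_attack, batch_size=2)
print(result)  # ([0, 1, 3], [2, 5], [4])
```

In this toy run, features 0 and 1 are freed by one batch query, features 2 and 5 admit a prediction-changing assignment, and feature 4 is in fact harmless but the loosened verifier cannot prove it, so it is reported as unknown rather than forced into either of the other two sets.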
