Table of Contents
Fetching ...

On the Probabilistic Learnability of Compact Neural Network Preimage Bounds

Luca Marzari, Manuele Bicego, Ferdinando Cicalese, Alessandro Farinelli

TL;DR

The paper tackles the intractability of exactly computing neural network preimage bounds, showing that the problem is $\#P$-hard, and introduces RF-ProVe, a probabilistic framework based on a random-forest ensemble to produce unions of axis-aligned boxes that approximate the preimage with high confidence. It provides Wilks-based probabilistic guarantees on region purity and global coverage through an active resampling step that validates leaf purity, achieving compact representations while maintaining coverage. Empirical results on safety-critical benchmarks (e.g., VCAS) and reinforcement learning tasks demonstrate that RF-ProVe attains comparable coverage to exact or prior probabilistic methods using far fewer regions and with substantially faster runtimes, especially in high-dimensional settings. The work shows strong scalability and practical value for safety analysis, debugging, and safe recovery by delivering interpretable, probabilistic preimage bounds that balance accuracy and computational feasibility.

Abstract

Although recent provable methods have been developed to compute preimage bounds for neural networks, their scalability is fundamentally limited by the #P-hardness of the problem. In this work, we adopt a novel probabilistic perspective, aiming to deliver solutions with high-confidence guarantees and bounded error. To this end, we investigate the potential of bootstrap-based and randomized approaches that are capable of capturing complex patterns in high-dimensional spaces, including input regions where a given output property holds. In detail, we introduce $\textbf{R}$andom $\textbf{F}$orest $\textbf{Pro}$perty $\textbf{Ve}$rifier ($\texttt{RF-ProVe}$), a method that exploits an ensemble of randomized decision trees to generate candidate input regions satisfying a desired output property and refines them through active resampling. Our theoretical derivations offer formal statistical guarantees on region purity and global coverage, providing a practical, scalable solution for computing compact preimage approximations in cases where exact solvers fail to scale.

On the Probabilistic Learnability of Compact Neural Network Preimage Bounds

TL;DR

The paper tackles the intractability of exactly computing neural network preimage bounds, showing that the problem is -hard, and introduces RF-ProVe, a probabilistic framework based on a random-forest ensemble to produce unions of axis-aligned boxes that approximate the preimage with high confidence. It provides Wilks-based probabilistic guarantees on region purity and global coverage through an active resampling step that validates leaf purity, achieving compact representations while maintaining coverage. Empirical results on safety-critical benchmarks (e.g., VCAS) and reinforcement learning tasks demonstrate that RF-ProVe attains comparable coverage to exact or prior probabilistic methods using far fewer regions and with substantially faster runtimes, especially in high-dimensional settings. The work shows strong scalability and practical value for safety analysis, debugging, and safe recovery by delivering interpretable, probabilistic preimage bounds that balance accuracy and computational feasibility.

Abstract

Although recent provable methods have been developed to compute preimage bounds for neural networks, their scalability is fundamentally limited by the #P-hardness of the problem. In this work, we adopt a novel probabilistic perspective, aiming to deliver solutions with high-confidence guarantees and bounded error. To this end, we investigate the potential of bootstrap-based and randomized approaches that are capable of capturing complex patterns in high-dimensional spaces, including input regions where a given output property holds. In detail, we introduce andom orest perty rifier (), a method that exploits an ensemble of randomized decision trees to generate candidate input regions satisfying a desired output property and refines them through active resampling. Our theoretical derivations offer formal statistical guarantees on region purity and global coverage, providing a practical, scalable solution for computing compact preimage approximations in cases where exact solvers fail to scale.

Paper Structure

This paper contains 10 sections, 5 theorems, 4 equations, 4 figures, 2 tables, 1 algorithm.

Key Result

Lemma 1

Fix a function $g: \mathbb{R}^d \mapsto \mathbb{R}.$ For any $R \in (0,1)$ and integer $n$, given a sample $X_1$ of $n$ values from a (continuous) set $X \subseteq \mathbb{R}^d$ the probability that for at least a fraction $R$ of the values in a further possibly infinite sequence of samples $x$ from

Figures (4)

  • Figure 1: Illustrative overview of AllDNN-Verification problem.
  • Figure 2: Explanatory image of the solution returned by our RF-ProVe.
  • Figure 3: Correlation samples complexity, number of trees, and depth decision trees.
  • Figure 4: Scalability results on RF-ProVe for growing input dimensionality.

Theorems & Definitions (11)

  • Definition 1: AllDNN-Verification Problem
  • Definition 2: Approximate AllDNN-Verification
  • Definition 3: Probabilistic Approximate AllDNN-Verification
  • Lemma 1: wilks
  • Corollary 2
  • Definition 4: $\xi$-bounded hyperrectangle
  • Lemma 3: Positive Samples in $b^{(\xi)}$
  • proof
  • Theorem 4: Coverage Guarantees of RF-ProVe
  • proof
  • ...and 1 more