Learning Minimal Neural Specifications

Chuqin Geng; Zhaoyue Wang; Haolin Ye; Xujie Si

Learning Minimal Neural Specifications

Chuqin Geng, Zhaoyue Wang, Haolin Ye, Xujie Si

TL;DR

This work addresses the challenge of robust verification for neural networks by moving beyond traditional data-centered specifications to learning minimal Neural Activation Pattern (NAP) specifications. It introduces three complementary approaches—Conservative Bottom Up (Coarsen), Statistical Coarsen (StochCoarsen), and Optimistic (OptAdvPrune)—to identify a minimal, general NAP that suffices for robustness verification, thereby increasing verifiable input regions and reducing the number of neurons required. Across Wisconsin, MNIST, and ImageNet-scale networks, the approach yields minimal NAPs that cover far more data with orders of magnitude larger estimated regions, while requiring far fewer neurons than refined NAPs; OptAdvPrune offers scalable starting points, and StochCoarsen provides strong efficiency with near-minimal results. Moreover, the paper links NAP-based specifications to visual interpretability and practical defense against adversarial attacks, suggesting that essential neurons underpin robust decision-making and can serve as empirical certificates even when full verification is computationally intractable. Overall, learning minimal NAP specifications advances both the theory and practice of neural network verification by enabling broader coverage, interpretability, and scalable robustness insights.

Abstract

Formal verification is only as good as the specification of a system, which is also true for neural network verification. Existing specifications follow the paradigm of data as specification, where the local neighborhood around a reference data point is considered correct or robust. While these specifications provide a fair testbed for assessing model robustness, they are too restrictive for verifying any unseen test data points, a challenging task with significant real-world implications. Recent work shows great promise through a new paradigm, neural representation as specification, which uses neural activation patterns (NAPs) for this purpose. However, it computes the most refined NAPs, which include many redundant neurons. In this paper, we study the following problem: Given a neural network, find a minimal (general) NAP specification that is sufficient for formal verification of its robustness properties. Finding the minimal NAP specification not only expands verifiable bounds but also provides insights into which set of neurons contributes to the model's robustness. To address this problem, we propose three approaches: conservative, statistical, and optimistic. Each of these methods offers distinct strengths and trade-offs in terms of minimality and computational speed, making them suitable for scenarios with different priorities. Notably, the optimistic approach can probe potential causal links between neurons and the robustness of large vision neural networks without relying on verification tools, a task existing methods struggle to scale. Our experiments show that minimal NAP specifications use far fewer neurons than those from previous work while expanding verifiable boundaries by several orders of magnitude.

Learning Minimal Neural Specifications

TL;DR

Abstract

Paper Structure (42 sections, 8 theorems, 20 equations, 9 figures, 5 tables)

This paper contains 42 sections, 8 theorems, 20 equations, 9 figures, 5 tables.

Introduction
Background
Neural Networks for Classification Tasks
Adversarial Attacks and Neural Networks Verification Problem
Neuron Abstractions and Neural Activation Patterns
Neuron Abstractions
The unary $\dot{\mathcal{A}}$ function
The binary $\ddot{\mathcal{A}}$ function
Regions Outlined by NAPs
NAP Specifications for Robustness Verification
The statistical $\widetilde{\mathcal{A}}$ function
The non-ambiguity property
The NAP robustness property
The NAP-augmented robustness property
The minimal NAP specification problem
...and 27 more sections

Key Result

Theorem 3.2

The algorithm Coarsen returns a minimal NAP specification with $\mathcal{O}(|N|)$ calls to $\mathcal{V}$.

Figures (9)

Figure 1: Illustration of verifiable regions of the Weak Left (WL) region of the ACAS_Xu acas_xu advisories for a head-on encounter with $a_{\text{prev}} = \text{Clear of Conflict (COC)}, \, \tau = 0\text{s}$. The red bounding boxes represent the verifiable regions when using data as specifications, constrained by the reference points and their adversarial examples. The green hatched areas denote verifiable regions when using Neural Activation Patterns (NAP) as specifications.
Figure 2: A simple 2x2 fully connected neural network and a subset of its NAP in a binary tree structure.
Figure 3: Linear regions are shattered by the simple 2X2 neural network. Each linear region corresponds to a most refined NAP, but not necessarily vice versa. More abstracted NAPs are formed by ignoring lines/hyperplanes created by neurons.
Figure 4: Geometric interpretation of NAPs on essential neurons. The first subfigure represents the case when the two NAPs disagree on $N_{4,1}$. The other three subfigures represent the three cases when the two NAPs disagree on $N_{4,1},N_{2,1}$. Regions colored green pass verification, whereas red indicates an adversarial example exists. Here, we omit states for $N_{1,1},N_{3,1}$ in those NAPs for simplicity.
Figure 5: Volume Estimation of $R_{P}$ using an orthotope in a 2-dimensional case. The gray rectangle represents the input space, with the training set depicted by a collection of data points $+$. The polygon corresponds to some NAP $P$. Initially, we identify an anchor point, denoted by $+$. Then, we construct the orthotope, represented by the green rectangle, by extending upper and lower bounds starting from the anchor point until it extends beyond $P$.
...and 4 more figures

Theorems & Definitions (14)

Definition 2.1: Neuron Abstraction Function
Definition 2.2: Neural Activation Pattern
Definition 2.3: Partially ordered NAP
Definition 2.4: Class NAP
Definition 3.1: The minimal NAP specification problem
Theorem 3.2
Theorem 3.3
Definition 4.1: Essential neuron
Theorem C.1
Theorem C.2
...and 4 more

Learning Minimal Neural Specifications

TL;DR

Abstract

Learning Minimal Neural Specifications

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (9)

Theorems & Definitions (14)