Input Validation for Neural Networks via Runtime Local Robustness Verification

Jiangchao Liu, Liqian Chen, Antoine Mine, Ji Wang

TL;DR

This work addresses the reliability of neural networks under adversarial perturbations by validating inputs at runtime with local robustness verification. It rests on two observations: valid (correctly classified) inputs have substantially larger robustness radii than misclassified or adversarial inputs, and the radii of valid inputs often follow a normal distribution. Building on these, it introduces two validation methods, validation by threshold and validation by distribution, which work with either complete or incomplete verifiers, reject suspicious inputs at runtime, and improve accuracy without attack-specific assumptions. Experiments show protection against adversarial examples, particularly those from strong attacks, at a runtime cost the authors consider practical, which suggests a path toward safer deployment of neural networks in safety-critical settings.
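The two validation methods named above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function names, the `reject_fraction` and `alpha` parameters, and the choice of a one-sided lower-tail test for validation by distribution are all assumptions made for the example, and the radii are made-up numbers.

```python
from statistics import NormalDist
from typing import Sequence


def fit_threshold(valid_radii: Sequence[float], reject_fraction: float = 0.1) -> float:
    """Pick a threshold that roughly reject_fraction of known-valid radii fall below.

    Hypothetical helper: the summary does not say how the threshold is chosen.
    """
    ordered = sorted(valid_radii)
    idx = min(int(reject_fraction * len(ordered)), len(ordered) - 1)
    return ordered[idx]


def validate_by_threshold(radius: float, threshold: float) -> bool:
    """Accept the input only if its robustness radius reaches the threshold."""
    return radius >= threshold


def fit_distribution(valid_radii: Sequence[float]) -> NormalDist:
    """Fit a normal distribution to the radii of known-valid inputs."""
    return NormalDist.from_samples(valid_radii)


def validate_by_distribution(radius: float, dist: NormalDist, alpha: float = 0.05) -> bool:
    """Accept unless the radius lies in the lower alpha-tail of the fitted normal.

    Assumption: a one-sided test, since small radii are the suspicious ones.
    """
    return dist.cdf(radius) >= alpha


# Made-up radii of correctly classified calibration inputs.
valid_radii = [0.030, 0.032, 0.028, 0.035, 0.031, 0.029, 0.033, 0.027, 0.034, 0.030]
threshold = fit_threshold(valid_radii)
dist = fit_distribution(valid_radii)
```

With these calibration values, an input with radius 0.030 passes both checks, while one with radius 0.005 (the kind of small radius the summary associates with adversarial examples) is rejected by both.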

Abstract

Local robustness verification can verify that a neural network is robust with respect to any perturbation of a specific input within a certain distance. We call this distance the robustness radius. We observe that the robustness radii of correctly classified inputs are much larger than those of misclassified inputs, which include adversarial examples, especially those from strong adversarial attacks. We also observe that the robustness radii of correctly classified inputs often follow a normal distribution. Based on these two observations, we propose to validate inputs for neural networks via runtime local robustness verification. Experiments show that our approach can protect neural networks from adversarial examples and improve their accuracy.
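One way to obtain a robustness radius from a verifier is to bisect on the perturbation distance. The sketch below assumes a hypothetical oracle `is_robust(x, eps)` that returns `True` when the network is verified robust around `x` within distance `eps`, and assumes the answer is monotone in `eps`. The abstract does not describe how radii are computed, so this is an illustration rather than the paper's procedure. With an incomplete verifier the result would be a lower bound on the exact radius.

```python
from typing import Callable, Sequence

Verifier = Callable[[Sequence[float], float], bool]


def robustness_radius(
    x: Sequence[float],
    is_robust: Verifier,
    upper: float = 1.0,
    tol: float = 1e-3,
) -> float:
    """Approximate the largest eps in [0, upper] with is_robust(x, eps) true.

    Bisection keeps the invariant: verified at lo, not verified at hi.
    """
    if is_robust(x, upper):
        return upper
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if is_robust(x, mid):
            lo = mid
        else:
            hi = mid
    return lo


def toy_is_robust(x: Sequence[float], eps: float) -> bool:
    """Toy stand-in for a verifier: a 1-D classifier with decision boundary at 0.5.

    The prediction cannot flip while the perturbation stays short of the boundary.
    """
    return abs(x[0] - 0.5) > eps


radius = robustness_radius([0.8], toy_is_robust)
```

Here the input 0.8 sits 0.3 away from the toy decision boundary, so the bisection converges to a radius within `tol` of 0.3. In a real deployment `toy_is_robust` would be replaced by a call to an actual complete or incomplete verifier.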

Paper Structure

This paper contains 13 sections, 5 equations, 8 figures, 2 tables, 2 algorithms.

Figures (8)

  • Figure 1: Robustness Radius from Complete Verification
  • Figure 2: The numbers of inputs which have a larger exact robustness radius on FNN-MNIST than a given value
  • Figure 3: The exact/approximate robustness radii of the first 100 valid inputs
  • Figure 4: The numbers of inputs which have a larger approximate robustness radius on FNN-MNIST than a given value
  • Figure 5: The numbers of inputs which have a larger approximate robustness radius on CNN-MNIST than a given value
  • ...and 3 more figures