Set-Valued Sensitivity Analysis of Deep Neural Networks

Xin Wang, Feilong Wang, Xuegang Ban

TL;DR

This work studies how the training solutions of deep neural networks respond to perturbations in the training data. Because DNNs can exhibit non-unique minima and losses that are not strongly convex, the solution is treated as a set-valued mapping $S(x)$. The paper develops a set-valued sensitivity framework that proves a Lipschitz-like property for the solution map of Deep Fully Connected Neural Networks (DFCNN) and provides a graphical-derivative-based method to estimate the new solution set after data perturbations without retraining. The approach yields a practical estimator, $S(x^p)\approx \bar{w}+D S(\bar{x}|\bar{w})(\Delta x)$, with an explicit expression for the graphical derivative; it reduces to the influence function in the isolated-minimum case. Empirical studies on a toy linear network and on ResNet-56 trained on CIFAR-10 support the theory, showing accurate prediction of solution-set shifts and near-zero training loss for the estimated solutions under data-poisoning-like perturbations. Overall, the paper extends implicit-function ideas to a set-valued regime, offering a framework to assess training stability and to study data-poisoning scenarios without relying on Hessian non-singularity.
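
The isolated-minimum special case mentioned above can be illustrated with a minimal sketch. It is not the paper's experiment: it assumes an ordinary least-squares loss $L(w;X,y)=\tfrac12\|Xw-y\|^2$ with an invertible Hessian, where the classical implicit-function argument gives $\Delta w \approx -H^{-1}\,\partial_X(\nabla_w L)[\Delta X]$. The function names `solve` and `estimate_shift` and all data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))                      # synthetic training inputs (assumption)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

def solve(X, y):
    """Retrain from scratch: the unique least-squares minimizer."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def estimate_shift(X, y, w_bar, dX):
    """First-order shift of the minimizer under the data perturbation dX.

    Stationarity is X^T (X w - y) = 0. Differentiating it in the
    direction dX gives H dw + (dX^T r + X^T dX w) = 0 with H = X^T X.
    """
    r = X @ w_bar - y                            # residual at the original solution
    H = X.T @ X                                  # Hessian of the loss in w
    mixed = dX.T @ r + X.T @ (dX @ w_bar)        # derivative of the gradient in X
    return -np.linalg.solve(H, mixed)

w_bar = solve(X, y)                              # original solution
dX = 1e-3 * rng.normal(size=(n, d))              # small data perturbation
w_est = w_bar + estimate_shift(X, y, w_bar, dX)  # estimate without retraining
w_true = solve(X + dX, y)                        # reference: actual retraining

err_est = np.linalg.norm(w_est - w_true)         # error of the first-order estimate
err_stale = np.linalg.norm(w_bar - w_true)       # error of keeping the old weights
print(err_est, err_stale)
```

In this sketch the estimate is compared against actual retraining only to check the approximation; the set-valued framework described here is what handles the case where $H$ is singular and this formula no longer applies.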

Abstract

This paper proposes a sensitivity analysis framework based on set-valued mappings for deep neural networks (DNNs) to understand and compute how the solutions (model weights) of a DNN respond to perturbations in the training data. Because a DNN may not have a unique solution (minimum), and the training algorithm may reach different solutions under minor perturbations to the input data, we focus on the sensitivity of the solution set of the DNN instead of studying a single solution. In particular, we are interested in the expansion and contraction of the set in response to data perturbations. If the change of the solution set can be bounded by the extent of the data perturbation, the model is said to exhibit the Lipschitz-like property. This "set-to-set" analysis approach provides a deeper understanding of the robustness and reliability of DNNs during training. Our framework incorporates both isolated and non-isolated minima and, critically, does not require the assumption that the Hessian of the loss function is non-singular. By developing set-level metrics such as the distance between sets, convergence of sets, derivatives of set-valued mappings, and stability across the solution set, we prove that the solution set of the Fully Connected Neural Network has Lipschitz-like properties. For general neural networks (e.g., ResNet), we introduce a graphical-derivative-based method to estimate the new solution set following data perturbation without retraining.

Paper Structure

This paper contains 12 sections, 6 theorems, 58 equations, 5 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

For given $\bar{x}_k$ and $\bar{w}$, the graphical derivative and coderivative of $F$ (in the sense of the paper's definition of these derivatives) at $\bar{x}_k$ for $\bar{w}$ admit explicit formulas, stated in the full paper. Proof: see Appendix A.

Figures (5)

  • Figure 1: Illustration of the excesses $e(C,D)$ and $e(D,C)$, where $C$ and $D$ are two closed sets. The Pompeiu-Hausdorff distance between $C$ and $D$ is $h(C, D)=\max \{e(C, D), e(D, C)\}$, which equals $e(D,C)$ in the illustrated case.
  • Figure 2: Locations of the original solution set $S(\bar{x})$ and the new solution set $S(x^p)$. The figure shows how $S(\bar{x})$ expands to cover $S(x^p)$; the red area is the expansion of $S(\bar{x})$ to $S(\bar{x})+\kappa\left\|\bar{x}-x^p\right\| \mathbb{B}$.
  • Figure 3: Left: Solution sets of the pristine and poisoned models. Right: The loss landscape of the poisoned toy model. The black and red points indicate the positions of the original solution $\bar{w}$ and the estimated solutions, respectively. The contour lines represent the corresponding losses.
  • Figure 4: 3D surface (left) and 2D contours (right) of the loss landscape of the poisoned ResNet-56, indicating how $\Delta w$ draws the original solution towards the valley of the landscape.
  • Figure 5: Left: Illustrative diagram showing the normal cone $N_{\Omega}(x)$ and tangent cone $T_{\Omega}(x)$ at point $x$ within the set $\Omega$. Right: $\eta$ is a tangent vector of $\Gamma$ at $\bar{\gamma}$ if there exist a sequence $\left\{\gamma_i\right\} \subset \Gamma$ with $\gamma_i \rightarrow \bar{\gamma}$ and a positive scalar sequence $\tau_i \rightarrow 0$ such that $\left(\gamma_i-\bar{\gamma}\right) / \tau_i \rightarrow \eta$.
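
The set distance in Figure 1 can be sketched numerically for finite point sets, taking the excess as $e(C,D)=\max_{c\in C}\min_{d\in D}\|c-d\|$. This is only an illustration under that finite-set assumption; the function names `excess` and `hausdorff` and the sample points are hypothetical, not taken from the paper.

```python
import numpy as np

def excess(C, D):
    """e(C, D): the largest distance from a point of C to the set D."""
    C, D = np.atleast_2d(C), np.atleast_2d(D)
    # pairwise[i, j] = ||C[i] - D[j]||
    pairwise = np.linalg.norm(C[:, None, :] - D[None, :, :], axis=-1)
    return pairwise.min(axis=1).max()

def hausdorff(C, D):
    """Pompeiu-Hausdorff distance h(C, D) = max{e(C, D), e(D, C)}."""
    return max(excess(C, D), excess(D, C))

# C is contained in D, so C does not exceed D at all,
# while D sticks out of C by the distance from (4, 0) to (1, 0).
C = np.array([[0.0, 0.0], [1.0, 0.0]])
D = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]])

print(excess(C, D))     # 0.0
print(excess(D, C))     # 3.0
print(hausdorff(C, D))  # 3.0
```

The example mirrors the situation in the caption: the excess is not symmetric, and here the maximum is attained by $e(D,C)$.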

Theorems & Definitions (11)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Lemma 4
  • Lemma 5
  • ...and 1 more