rSDNet: Unified Robust Neural Learning against Label Noise and Adversarial Attacks

Suryasis Jana, Abhik Ghosh

Abstract

Neural networks are central to modern artificial intelligence, yet their training remains highly sensitive to data contamination. Standard neural classifiers are trained by minimizing the categorical cross-entropy loss, which corresponds to maximum likelihood estimation under a multinomial model. While statistically efficient under ideal conditions, this approach is highly vulnerable to contaminated observations, including label noise, which corrupts supervision in the output space, and adversarial perturbations, which induce worst-case deviations in the input space. In this paper, we propose a unified and statistically grounded framework for robust neural classification that addresses both forms of contamination within a single learning objective. We formulate neural network training as a minimum-divergence estimation problem and introduce rSDNet, a robust learning algorithm based on the general class of $S$-divergences. The resulting training objective inherits robustness properties from classical statistical estimation, automatically down-weighting aberrant observations through model probabilities. We establish essential population-level properties of rSDNet, including Fisher consistency, classification calibration implying Bayes optimality, and robustness guarantees under uniform label noise and infinitesimal feature contamination. Experiments on three benchmark image classification datasets show that rSDNet improves robustness to label corruption and adversarial attacks while maintaining competitive accuracy on clean data. Our results highlight minimum-divergence learning as a principled and effective framework for robust neural classification under heterogeneous data contamination.
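To make the minimum-divergence objective concrete, the following is a minimal sketch of a per-sample $S$-divergence loss between a one-hot label and the predicted class probabilities. It assumes the standard $S$-divergence form from the statistics literature, with $A = 1 + \lambda(1-\beta)$ and $B = \beta - \lambda(1-\beta)$, and restricts to $A > 0$, $B > 0$ for simplicity. The function name `s_divergence_loss` and this parameter restriction are illustrative assumptions, not the paper's implementation or its admissible set $\mathcal{T}$.

```python
def s_divergence_loss(probs, label, beta, lam):
    """Per-sample S-divergence between a one-hot label and predicted probabilities.

    Hypothetical sketch: assumes the standard S-divergence with
    A = 1 + lam * (1 - beta) and B = beta - lam * (1 - beta),
    restricted here to A > 0 and B > 0.
    """
    A = 1.0 + lam * (1.0 - beta)
    B = beta - lam * (1.0 - beta)
    if A <= 0.0 or B <= 0.0:
        raise ValueError("this sketch only covers A > 0 and B > 0")

    # Term depending on the model only: (1/A) * sum_j p_j^(1+beta).
    model_term = sum(p ** (1.0 + beta) for p in probs) / A
    # Cross term: for a one-hot label only the labelled class contributes.
    cross_term = (1.0 + beta) / (A * B) * probs[label] ** B
    # Term depending on the label only: (1/B) * sum_j y_j^(1+beta) = 1/B.
    label_term = 1.0 / B
    return model_term - cross_term + label_term


# The loss vanishes at a perfect prediction.
perfect = s_divergence_loss([1.0, 0.0, 0.0], 0, beta=0.5, lam=0.0)
# An almost certainly mislabelled sample still incurs only a bounded loss,
# whereas cross-entropy would grow without bound as probs[label] -> 0.
outlier = s_divergence_loss([1e-9, 1.0 - 1e-9, 0.0], 0, beta=0.5, lam=0.0)
```

Under these assumptions the loss never exceeds $1/A + 1/B$, which is one way to see why a sample whose label strongly disagrees with the model has limited influence on training. Setting `lam=0.0` reduces the expression to a density power divergence with parameter `beta`.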

Paper Structure

This paper contains 23 sections, 5 theorems, 38 equations, 6 figures, 5 tables.

Key Result

Theorem 3.1

For any $(\beta, \lambda)\in\mathcal{T}$, the posterior class probabilities of the rSDNet functional, obtained using the MSDF in EQ:rsdnet-func, coincide with the true posterior class probabilities under $f_{\bm{\theta}_0}$ for any marginal distribution $G_{\bm{X}}$, provided that the conditional model is correctly specified with $g \equiv f_{\bm{\theta}_0}$ for some $\bm{\theta}_0 \in \Theta$.

Figures (6)

  • Figure 1: Plots of the bound $M_{\beta,\lambda}^{\eta}$ on the excess risk of rSDNet, as a function of the tuning parameters $(\beta, \lambda)$, for different contamination proportions $\eta$ and $J=10$.
  • Figure 2: IFs for the MSDFs of parameters under the NN model (M1) in Example 3.1, for different tuning parameters $(\beta, \lambda)$.
  • Figure 3: IFs for the MSDFs of parameters under the NN model (M2) in Example 3.1, for different tuning parameters $(\beta, \lambda)$.
  • Figure 4: IFs for the MSDFs of parameters under the NN model (M3) in Example 3.1, for different tuning parameters $(\beta, \lambda)$.
  • Figure 5: Test accuracies obtained by different NN learning methods trained with varying numbers of epochs for the MNIST dataset.
  • ...and 1 more figure
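
The abstract refers to uniform label noise, and Figure 1 is indexed by a contamination proportion $\eta$ with $J=10$ classes. As a rough illustration, the sketch below simulates such noise under one common convention: each label is replaced, with probability $\eta$, by a class drawn uniformly from all $J$ classes. The function name `add_uniform_label_noise` and this exact convention are assumptions; the paper's noise model may differ (for example, by drawing only from the other $J-1$ classes).

```python
import random


def add_uniform_label_noise(labels, eta, num_classes, seed=0):
    """Return a copy of `labels` with uniform label noise (hypothetical helper).

    With probability `eta`, a label is replaced by a class drawn uniformly
    from {0, ..., num_classes - 1}; the draw may coincide with the original.
    """
    rng = random.Random(seed)  # local generator keeps the result reproducible
    noisy = []
    for y in labels:
        if rng.random() < eta:
            noisy.append(rng.randrange(num_classes))
        else:
            noisy.append(y)
    return noisy


clean = [i % 10 for i in range(1000)]
noisy = add_uniform_label_noise(clean, eta=0.3, num_classes=10)
flipped_fraction = sum(a != b for a, b in zip(clean, noisy)) / len(clean)
```

Because a replacement can land on the original class, the expected fraction of labels that actually change under this convention is $\eta (J-1)/J$ rather than $\eta$.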

Theorems & Definitions (7)

  • Theorem 3.1: Fisher Consistency of rSDNet
  • Theorem 3.2: Classification calibration of rSDNet
  • Lemma 3.3
  • Theorem 3.4
  • Theorem 3.5
  • Remark 3.1
  • Example 3.1