rSDNet: Unified Robust Neural Learning against Label Noise and Adversarial Attacks

Suryasis Jana; Abhik Ghosh

Abstract

Neural networks are central to modern artificial intelligence, yet their training remains highly sensitive to data contamination. Standard neural classifiers are trained by minimizing the categorical cross-entropy loss, which corresponds to maximum likelihood estimation under a multinomial model. While statistically efficient under ideal conditions, this approach is highly vulnerable to contaminated observations, including label noise that corrupts supervision in the output space and adversarial perturbations that induce worst-case deviations in the input space. In this paper, we propose a unified and statistically grounded framework for robust neural classification that addresses both forms of contamination within a single learning objective. We formulate neural network training as a minimum-divergence estimation problem and introduce rSDNet, a robust learning algorithm based on the general class of $S$-divergences. The resulting training objective inherits robustness properties from classical statistical estimation, automatically down-weighting aberrant observations through model probabilities. We establish essential population-level properties of rSDNet, including Fisher consistency, classification calibration implying Bayes optimality, and robustness guarantees under uniform label noise and infinitesimal feature contamination. Experiments on three benchmark image classification datasets show that rSDNet improves robustness to label corruption and adversarial attacks while maintaining competitive accuracy on clean data. Our results highlight minimum-divergence learning as a principled and effective framework for robust neural classification under heterogeneous data contamination.
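To make the contrast with cross-entropy concrete, the following is a minimal sketch of an $S$-divergence between a one-hot label vector and a vector of predicted class probabilities. It uses the standard form of the $S$-divergence from the robust statistics literature, $S(g, f) = \frac{1}{A}\sum_j f_j^{1+\beta} - \frac{1+\beta}{AB}\sum_j f_j^{B} g_j^{A} + \frac{1}{B}\sum_j g_j^{1+\beta}$ with $A = 1 + \lambda(1-\beta)$ and $B = \beta - \lambda(1-\beta)$. Identifying the paper's $(\beta, \lambda)$ with this classical parametrization is an assumption, the function name `s_divergence` is hypothetical, and the exact rSDNet objective may differ in scaling or constants.

```python
import numpy as np


def s_divergence(g, f, beta, lam):
    """Discrete S-divergence between label distribution g and model probabilities f.

    Assumes the classical parametrization A = 1 + lam*(1 - beta),
    B = beta - lam*(1 - beta); this mapping to the paper's (beta, lambda)
    is an assumption, not taken from the source.
    """
    g = np.asarray(g, dtype=float)
    f = np.asarray(f, dtype=float)
    A = 1.0 + lam * (1.0 - beta)
    B = beta - lam * (1.0 - beta)
    if np.isclose(A, 0.0) or np.isclose(B, 0.0):
        # The limiting cases A -> 0 or B -> 0 need separate formulas.
        raise ValueError("A and B must both be nonzero in this sketch")
    term_model = np.sum(f ** (1.0 + beta)) / A
    term_cross = (1.0 + beta) / (A * B) * np.sum(f ** B * g ** A)
    term_label = np.sum(g ** (1.0 + beta)) / B
    return term_model - term_cross + term_label


# A sample whose label disagrees strongly with the model prediction,
# as would happen for a mislabeled training point.
onehot = np.array([1.0, 0.0, 0.0])
confident_wrong = np.array([1e-6, 1.0 - 2e-6, 1e-6])

# Cross-entropy for the same sample, for comparison.
ce_loss = -np.sum(onehot * np.log(confident_wrong))
sd_loss = s_divergence(onehot, confident_wrong, beta=0.5, lam=0.0)

print("cross-entropy:", ce_loss)
print("S-divergence :", sd_loss)
```

With `lam=0.0` the expression reduces to the density power divergence. In this setting the loss for a sample that contradicts the model stays bounded as the predicted probability of the labeled class approaches zero, whereas the cross-entropy grows without bound, which is one way to read the down-weighting of aberrant observations described above.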

Paper Structure (23 sections, 5 theorems, 38 equations, 6 figures, 5 tables)

Introduction
The proposed robust learning framework
Model setup and notations
Minimum divergence learning framework
$S$-Divergence family and the rSDNet objective
The final rSDNet learning algorithm
Theoretical guarantees
Statistical consistency of rSDNet functionals
Tolerance against uniform label noise
Local robustness against contaminated features
Empirical Evaluation on Image Classification
Experimental setups: Datasets and NN architectures
Performances under clean data
Robustness against uniform label noises
Stability against diverse adversarial attacks
...and 8 more sections

Key Result

Theorem 3.1

For any $(\beta, \lambda)\in\mathcal{T}$, the posterior class probabilities of the rSDNet functional, obtained using the MSDF in EQ:rsdnet-func, coincide with the true posterior class probabilities for any marginal distribution $G_{\bm{X}}$, provided that the conditional model is correctly specified with $g \equiv f_{\bm{\theta}_0}$ for some $\bm{\theta}_0 \in \Theta$.

Figures (6)

Figure 1: Plots of the bound $M_{\beta,\lambda}^{\eta}$ on the excess risk of rSDNet, as a function of the tuning parameters $(\beta, \lambda)$, for different contamination proportions $\eta$ and $J=10$.
Figure 2: IFs for the MSDFs of parameters under the NN model (M1) in Example 3.1, for different tuning parameters $(\beta, \lambda)$.
Figure 3: IFs for the MSDFs of parameters under the NN model (M2) in Example 3.1, for different tuning parameters $(\beta, \lambda)$.
Figure 4: IFs for the MSDFs of parameters under the NN model (M3) in Example 3.1, for different tuning parameters $(\beta, \lambda)$.
Figure 5: Test accuracies obtained by different NN learning methods trained with varying numbers of epochs for the MNIST dataset
...and 1 more figures

Theorems & Definitions (7)

Theorem 3.1: Fisher Consistency of rSDNet
Theorem 3.2: Classification calibration of rSDNet
Lemma 3.3
Theorem 3.4
Theorem 3.5
Remark 3.1
Example 3.1
