Outlier-robust neural network training: variation regularization meets trimmed loss to prevent functional breakdown

Akifumi Okuno; Shotaro Yagishita

TL;DR

The paper tackles outlier robustness in highly expressive neural networks by combining a transformed trimmed loss (TTL) with higher-order variation regularization (HOVR), forming the augmented and regularized TTL (ARTL) objective. It redefines robustness through a functional breakdown point $\mathcal{E}_{k,q}^*(f_{\hat{\theta}},Z)$ and proves the guarantee $\mathcal{E}_{k,q}^*(f_{\hat{\theta}},Z) \ge \frac{n-h+1}{n}$, ensuring tolerance of up to $n-h$ outliers. Optimization is carried out by stochastic gradient–supergradient descent (SGSD) on the ARTL objective, with an unbiased gradient estimator for the HOVR term and convergence guarantees under standard stochastic optimization assumptions. Empirical results on synthetic nonlinear functions and UCI benchmarks show that TTL+HOVR is more robust and more faithful to the underlying signal than baselines based on robust losses and RANSAC, while preserving expressive power. The work provides a scalable, theoretically grounded path for robust neural network training in contaminated data scenarios.
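As a worked instance of the bound above, assume that $h$ is the number of samples the trimmed loss retains, and take the illustrative values $n = 100$ and $h = 90$ (these numbers are not from the paper):

```latex
\mathcal{E}_{k,q}^*(f_{\hat{\theta}},Z) \;\ge\; \frac{n-h+1}{n} \;=\; \frac{100-90+1}{100} \;=\; 0.11,
\qquad n-h = 10 .
```

Under these assumed values, the guarantee covers up to 10 contaminated samples out of 100.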

Abstract

In this study, we tackle the challenge of outlier-robust predictive modeling using highly expressive neural networks. Our approach integrates two key components: (1) a transformed trimmed loss (TTL), a computationally efficient variant of the classical trimmed loss, and (2) higher-order variation regularization (HOVR), which imposes smoothness constraints on the prediction function. While traditional robust statistics typically assume low-complexity models such as linear and kernel models, applying TTL alone to modern neural networks may fail to ensure robustness, as their high expressive power allows them to fit both inliers and outliers, even when a robust loss is used. To address this, we revisit the traditional notion of breakdown point and adapt it to the nonlinear function setting, introducing a regularization scheme via HOVR that controls the model's capacity and suppresses overfitting to outliers. We theoretically establish that our training procedure retains a high functional breakdown point, thereby ensuring robustness to outlier contamination. We develop a stochastic optimization algorithm tailored to this framework and provide a theoretical guarantee of its convergence.
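The two ingredients named in the abstract can be illustrated with a minimal sketch. It uses the classical trimmed loss (the sum of the h smallest squared residuals) rather than the paper's TTL, whose exact form is not given here, and a finite-difference stand-in for HOVR on a one-dimensional input. The function names, the uniform grid, and the choices k = 2 and q = 2 are assumptions made for illustration only.

```python
# Illustrative sketch only: the classical trimmed loss and a finite-difference
# variation penalty. Neither is the paper's exact TTL or HOVR definition.


def trimmed_loss(residuals, h):
    """Sum of the h smallest squared residuals (classical trimmed loss)."""
    if not 1 <= h <= len(residuals):
        raise ValueError("h must satisfy 1 <= h <= n")
    squared = sorted(r * r for r in residuals)
    return sum(squared[:h])


def variation_penalty(f, lo, hi, num_points=101, q=2.0):
    """Approximate the integral of |f''(x)|^q over [lo, hi] on a uniform grid.

    Assumption: a 1-D input and second-order (k = 2) variation, estimated with
    central finite differences and a Riemann sum over interior grid points.
    """
    step = (hi - lo) / (num_points - 1)
    total = 0.0
    for i in range(1, num_points - 1):
        x = lo + i * step
        second_diff = (f(x - step) - 2.0 * f(x) + f(x + step)) / (step * step)
        total += abs(second_diff) ** q * step
    return total


# Nine small residuals plus one gross outlier (made-up data).
residuals = [0.1, -0.2, 0.05, 0.0, 0.15, -0.1, 0.2, -0.05, 0.1, 50.0]
full = trimmed_loss(residuals, h=10)    # keeps every sample, outlier included
trimmed = trimmed_loss(residuals, h=9)  # discards the single largest residual

# A linear function has zero second derivative; x^2 has f'' = 2 everywhere.
linear_penalty = variation_penalty(lambda x: 2.0 * x + 1.0, 0.0, 1.0)
curved_penalty = variation_penalty(lambda x: x * x, 0.0, 1.0)
```

With h = 9 the outlier's squared residual drops out of the loss entirely, which is what the trimming is for. The penalty is near zero for the linear function and strictly positive for the curved one, which is the sense in which a variation term discourages a network from bending sharply to reach isolated outliers.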

Paper Structure (37 sections, 4 theorems, 34 equations, 21 figures, 2 tables)

Introduction
Related works
Robust NN Training to Prevent Breakdown
Neural network
Trimmed loss function
Parameter breakdown point
Functional breakdown point
HOVR
Stochastic Optimization Algorithm
ARTL and SGSD
SGSD in Practical Scenarios:
A Practical Stochastic Gradient
Convergence Analysis
Experiments
Synthetic Dataset Experiments
...and 22 more sections

Key Result

Theorem 3

Let $f_{\hat{\theta}}$ be the prediction model trained by minimizing the regularized trimmed loss (\ref{eq:HOV-regularized-TTL}). The following holds:

Figures (21)

Figure 1: Linear+Huber
Figure 2: NN+Huber
Figure 3: NN+Tukey
Figure 4: Ours: NN+ARTL
Figure 6: True
...and 16 more figures

Theorems & Definitions (10)

Definition 1
Definition 2
Theorem 3
Theorem 4
Corollary 1
Lemma 5
proof
proof: Proof of Theorem \ref{thm:bound-SGSD} (i)
proof: Proof of Theorem \ref{thm:bound-SGSD} (ii)
proof
