Outlier-robust neural network training: variation regularization meets trimmed loss to prevent functional breakdown

Akifumi Okuno, Shotaro Yagishita

TL;DR

The paper tackles outlier robustness in highly expressive neural networks by introducing a transformed trimmed loss (TTL) combined with higher-order variation regularization (HOVR), forming the augmented and regularized TTL (ARTL) objective. It redefines robustness through a functional breakdown point $\mathcal{E}_{k,q}^*(f_{\hat{\theta}},Z)$ and proves the guarantee $\mathcal{E}_{k,q}^*(f_{\hat{\theta}},Z) \ge \frac{n-h+1}{n}$, ensuring tolerance to up to $n-h$ outliers. Optimization is achieved via stochastic gradient–supergradient descent (SGSD) on the ARTL objective, with an unbiased gradient estimator for the HOVR term and convergence guarantees under standard stochastic optimization assumptions. Empirical results on synthetic nonlinear functions and UCI benchmarks show that TTL+HOVR yields superior robustness and fidelity to the underlying signal, outperforming baselines based on robust losses and RANSAC while preserving expressive power. The work provides a scalable, theoretically grounded path for robust neural network training in contaminated data scenarios.
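
To make the guarantee concrete, the following minimal sketch evaluates the lower bound $(n-h+1)/n$ for one hypothetical setting. It assumes that $h$ denotes the number of samples retained by the trimmed loss, as in the classical definition; the function name `breakdown_lower_bound` and the values of `n` and `h` are illustrative and do not come from the paper.

```python
def breakdown_lower_bound(n: int, h: int) -> float:
    """Lower bound (n - h + 1) / n on the functional breakdown point.

    Assumes h is the number of samples kept by the trimmed loss (1 <= h <= n).
    """
    if not 1 <= h <= n:
        raise ValueError("need 1 <= h <= n")
    return (n - h + 1) / n


# Hypothetical setting: 100 samples, of which the 90 best-fitting are kept.
n, h = 100, 90
bound = breakdown_lower_bound(n, h)
tolerated = n - h  # number of outliers the guarantee is stated to tolerate

print(f"lower bound = {bound:.2f}, tolerated outliers = {tolerated}")
```

In this setting the bound is $11/100$, so the model is guaranteed not to break down until at least 11 of the 100 samples are contaminated, which matches the stated tolerance of $n-h = 10$ outliers.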

Abstract

In this study, we tackle the challenge of outlier-robust predictive modeling using highly expressive neural networks. Our approach integrates two key components: (1) a transformed trimmed loss (TTL), a computationally efficient variant of the classical trimmed loss, and (2) higher-order variation regularization (HOVR), which imposes smoothness constraints on the prediction function. While traditional robust statistics typically assume low-complexity models such as linear and kernel models, applying TTL alone to modern neural networks may fail to ensure robustness, as their high expressive power allows them to fit both inliers and outliers, even when a robust loss is used. To address this, we revisit the traditional notion of breakdown point and adapt it to the nonlinear function setting, introducing a regularization scheme via HOVR that controls the model's capacity and suppresses overfitting to outliers. We theoretically establish that our training procedure retains a high functional breakdown point, thereby ensuring robustness to outlier contamination. We develop a stochastic optimization algorithm tailored to this framework and provide a theoretical guarantee of its convergence.
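
The two ingredients named in the abstract can be illustrated with a small NumPy sketch. It shows the classical trimmed loss (of which TTL is described as a variant) and a simple one-dimensional curvature penalty. The penalty is only an assumed stand-in for a second-order variation term, not the paper's HOVR definition, and the function names, residuals, and test functions are hypothetical.

```python
import numpy as np


def trimmed_loss(residuals, h):
    """Mean of the h smallest squared residuals (classical trimmed loss)."""
    squared = np.sort(np.asarray(residuals, dtype=float) ** 2)
    return float(squared[:h].mean())


def variation_penalty(f, xs, q=2.0, eps=1e-3):
    """Mean of |f''(x)|^q over xs, with f'' from a central finite difference.

    Illustrative stand-in for a second-order variation term in 1D;
    NOT the paper's HOVR definition.
    """
    xs = np.asarray(xs, dtype=float)
    second_diff = (f(xs + eps) - 2.0 * f(xs) + f(xs - eps)) / eps**2
    return float(np.mean(np.abs(second_diff) ** q))


# Hypothetical residuals: three inliers and one gross outlier.
residuals = np.array([0.1, -0.2, 0.05, 10.0])
full_mse = float(np.mean(residuals ** 2))  # dominated by the outlier
trimmed = trimmed_loss(residuals, h=3)     # the outlier's term is dropped

# A straight line has zero curvature; a parabola has f'' = 2 everywhere.
xs = np.linspace(0.0, 1.0, 11)
pen_line = variation_penalty(lambda x: 3.0 * x + 1.0, xs)
pen_parabola = variation_penalty(lambda x: x ** 2, xs)

print(full_mse, trimmed, pen_line, pen_parabola)
```

The trimmed loss ignores the single large residual, but a sufficiently flexible model could still drive every residual to zero by bending sharply toward the outlier. A curvature-type penalty makes such bending costly, which is the intuition behind pairing the trimmed loss with a variation regularizer.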

Paper Structure (37 sections, 4 theorems, 34 equations, 21 figures, 2 tables)

This paper contains 37 sections, 4 theorems, 34 equations, 21 figures, 2 tables.

Key Result

Theorem 3

Let $f_{\hat{\theta}}$ be the prediction model trained by minimizing the regularized trimmed loss (eq:HOV-regularized-TTL). The following holds:

Figures (21)

  • Figure 1: Linear+Huber
  • Figure 2: NN+Huber
  • Figure 3: NN+Tukey
  • Figure 4: Ours: NN+ARTL
  • Figure 6: True
  • ...and 16 more figures

Theorems & Definitions (10)

  • Definition 1
  • Definition 2
  • Theorem 3
  • Theorem 4
  • Corollary 1
  • Lemma 5
  • proof
  • proof: Proof of Theorem \ref{thm:bound-SGSD} (i)
  • proof: Proof of Theorem \ref{thm:bound-SGSD} (ii)
  • proof