Robust Regression under Adversarial Contamination: Theory and Algorithms for the Welsch Estimator

Ilyes Hammouda; Mohamed Ndaoud; Abd-Krim Seghouane

TL;DR

The paper tackles robust regression under adversarial contamination by analyzing a smooth redescending M-estimator, the Welsch loss. It introduces a two-stage algorithm with LAD initialization, and proves that the loss landscape is locally convex within a basin of attraction, enabling provable recovery of the estimator. The authors establish non-asymptotic minimax deviation bounds and bias reduction under contamination, along with asymptotic normality and efficiency, supported by extensive simulations and real-data experiments. Collectively, the work demonstrates that the Welsch estimator can achieve reduced outlier bias, improved robustness, and statistical efficiency in high-dimensional linear regression despite adversarial perturbations.

Abstract

Convex and penalized robust regression methods often suffer from a persistent bias induced by large outliers, limiting their effectiveness in adversarial or heavy-tailed settings. In this work, we study a smooth redescending non-convex M-estimator, specifically the Welsch estimator, and show that it can eliminate this bias whenever it is statistically identifiable. We focus on high-dimensional linear regression under adversarial contamination, where a fraction of samples may be corrupted by an adversary with full knowledge of the data and underlying model. A central technical contribution of this paper is a practical algorithm that provably finds a statistically valid solution to this non-convex problem. We show that the Welsch objective remains locally convex within a well-characterized basin of attraction, and our algorithm is guaranteed to converge into this region and recover the desired estimator. We establish three main guarantees: (a) non-asymptotic minimax-optimal deviation bounds under contamination, (b) improved unbiasedness in the presence of large outliers, and (c) asymptotic normality, yielding statistical efficiency as the sample size grows. Finally, we support our theoretical findings with comprehensive experiments on synthetic and real datasets, demonstrating the estimator's superior robustness, efficiency, and effectiveness in mitigating outlier-induced bias relative to state-of-the-art robust regression methods.
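A minimal sketch of the two-stage idea described in the abstract, in Python with NumPy: a LAD fit supplies the starting point, and the Welsch objective is then refined from it. Several details are assumptions rather than the authors' procedure. The loss is taken in the common form $\rho_\tau(r) = \frac{\tau^2}{2}\left(1 - e^{-r^2/\tau^2}\right)$, the LAD step is approximated by iteratively reweighted least squares (IRLS), the refinement also uses IRLS with weights $e^{-r^2/\tau^2}$, and the function names, the value of `tau`, and the simulated data are all hypothetical.

```python
import numpy as np

def weighted_ls(X, y, w):
    """Solve min_b sum_i w_i * (y_i - x_i^T b)^2 via a rescaled least-squares solve."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def lad_init(X, y, n_iter=50, eps=1e-6):
    """Stage 1 (assumed form): approximate LAD fit by IRLS with weights 1 / max(|r_i|, eps)."""
    beta = weighted_ls(X, y, np.ones(len(y)))  # start from ordinary least squares
    for _ in range(n_iter):
        r = y - X @ beta
        beta = weighted_ls(X, y, 1.0 / np.maximum(np.abs(r), eps))
    return beta

def welsch_fit(X, y, tau, n_iter=100, tol=1e-10):
    """Stage 2 (assumed form): IRLS on the Welsch loss, started from the LAD fit."""
    beta = lad_init(X, y)
    for _ in range(n_iter):
        r = y - X @ beta
        w = np.exp(-(r / tau) ** 2)  # Welsch weights: large residuals get weight near zero
        new_beta = weighted_ls(X, y, w)
        if np.linalg.norm(new_beta - beta) < tol:
            return new_beta
        beta = new_beta
    return beta

# Hypothetical simulation: 10% of the responses are shifted by a large constant.
rng = np.random.default_rng(0)
n, p = 500, 5
beta_star = np.ones(p)
X = rng.normal(size=(n, p))
y = X @ beta_star + rng.normal(size=n)
y[: n // 10] += 50.0

beta_ols = weighted_ls(X, y, np.ones(n))
beta_welsch = welsch_fit(X, y, tau=3.0)
err_ols = np.linalg.norm(beta_ols - beta_star)
err_welsch = np.linalg.norm(beta_welsch - beta_star)
print(f"OLS error: {err_ols:.3f}, Welsch error: {err_welsch:.3f}")
```

Because the Welsch weights decay to zero for large residuals, gross outliers are effectively dropped from the final weighted fit. This is the redescending behavior the abstract contrasts with convex losses, whose influence on the estimate does not vanish for large outliers.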

Paper Structure

This paper contains 23 sections, 12 theorems, 82 equations, 8 figures, and 1 algorithm.

Introduction
Related work
Contributions
Adversarial contamination in linear regression
Notations
On the theoretical properties of the Welsch estimator
On the loss landscape
Non-asymptotic optimality
Asymptotic efficiency
Numerical experiments
Simulated data
Real-world data
Technical results
Proof of Theorem \ref{conv}
Proof of Proposition \ref{convex_set}
...and 8 more sections

Key Result

Theorem 1

Let the function $f(\cdot)$ be defined as follows: where $\beta \in \mathbf{R}^{p}$ is a parameter vector and the observations $(Y_1,X_1),\ldots,(Y_n,X_n)$ satisfy Assumptions assmp1-assmp3. Then, with probability at least $1 - 2 \exp(-Dn/C^2)$, the function $f$ is strictly convex on the set $\mathscr{O}_{\tau}$, where the constant $D$ satisfies the inequality: for a sufficiently large $C$.
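The local convexity claim can be probed numerically. The sketch below is an illustration under assumptions, not the paper's construction: it takes $f(\beta) = \frac{1}{n}\sum_{i} \rho_\tau(Y_i - X_i^\top \beta)$ with $\rho_\tau(r) = \frac{\tau^2}{2}\left(1 - e^{-r^2/\tau^2}\right)$, so that $\rho_\tau''(r) = e^{-r^2/\tau^2}\left(1 - 2r^2/\tau^2\right)$ and the Hessian is $\frac{1}{n}\sum_i \rho_\tau''(r_i) X_i X_i^\top$. Since the set $\mathscr{O}_{\tau}$ is not defined in this excerpt, a point close to the true parameter stands in for it, and the data, `tau`, and function name are hypothetical.

```python
import numpy as np

def welsch_hessian(X, y, beta, tau):
    """Hessian of the assumed empirical Welsch loss (1/n) * sum_i rho_tau(y_i - x_i^T beta)."""
    r = y - X @ beta
    u = (r / tau) ** 2
    curv = np.exp(-u) * (1.0 - 2.0 * u)  # rho''(r): positive for small |r|, negative for moderate |r|
    return (X * curv[:, None]).T @ X / len(y)

# Hypothetical contaminated sample: Gaussian design, 10% of responses shifted by 50.
rng = np.random.default_rng(1)
n, p, tau = 2000, 5, 3.0
beta_star = np.ones(p)
X = rng.normal(size=(n, p))
y = X @ beta_star + rng.normal(size=n)
y[: n // 10] += 50.0

# Smallest Hessian eigenvalue at a point near beta_star and at a point far from it.
shift = np.zeros(p)
shift[0] = 5.0
eig_near = np.linalg.eigvalsh(welsch_hessian(X, y, beta_star + 0.1, tau)).min()
eig_far = np.linalg.eigvalsh(welsch_hessian(X, y, beta_star + shift, tau)).min()
print(f"min eigenvalue near beta*: {eig_near:.3f}, far from beta*: {eig_far:.3f}")
```

In this simulation the smallest eigenvalue is positive near the true parameter and negative at the distant point. That matches the picture of a loss that is convex only inside a basin around the target, which is why a good initializer matters.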

Figures (8)

Figure 1: Comparison of the Euclidean norm of the bias of the Welsch estimator against other robust estimators.
Figure 2: Comparison between Huber and Welsch loss functions.
Figure 3: Landscape of the empirical Welsch loss function.
Figure 4: Comparison of the speed of convergence between Huber and Welsch estimators.
Figure 5: Distribution of the MSE of Huber and Welsch estimators under corruption.
...and 3 more figures

Theorems & Definitions (17)

Remark 1
Theorem 1
Remark 2
Proposition 1
Remark 3
Theorem 2
Proposition 2
Remark 4
Theorem 3
Theorem 4
...and 7 more
