Adversarial Robustness of Nonparametric Regression
Parsa Moradi, Hanzaleh Akabrinodehi, Mohammad Ali Maddah-Ali
TL;DR
The paper studies adversarial robustness in nonparametric regression for functions in the second-order Sobolev space $\mathcal{W}^2(\Omega)$ when up to $q$ of the $n$ samples can be arbitrarily corrupted. It analyzes the classical smoothing spline estimator with regularization parameter $\lambda$ and derives an upper bound on its adversarial risk, together with a minimax lower bound showing that no estimator can achieve vanishing error when $q = \Theta(n)$. These results establish that smoothing splines remain robust for $q = o(n)$, which is optimal in terms of the maximum tolerable number of corrupted samples, and that the linear-corruption regime poses a fundamental limitation for every estimator. Experiments under various designs and attack strategies corroborate the theory and illustrate the practical robustness of smoothing splines.
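For orientation, the smoothing spline in its standard textbook form is the solution of a penalized least-squares problem. The display below is a sketch under assumptions: the $1/n$ normalization and the symbols $(x_i, \tilde{y}_i)$ for the observed, possibly corrupted, samples are illustrative and may differ from the paper's own notation.

$$
\hat{f}_\lambda \;=\; \arg\min_{f \in \mathcal{W}^2(\Omega)} \; \frac{1}{n} \sum_{i=1}^{n} \bigl(\tilde{y}_i - f(x_i)\bigr)^2 \;+\; \lambda \int_\Omega \bigl(f''(t)\bigr)^2 \, dt
$$

A larger $\lambda$ penalizes curvature more heavily, so each individual sample has less influence on the fit. This is the intuition for why the choice of regularization matters when some samples are corrupted.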
Abstract
In this paper, we investigate the adversarial robustness of nonparametric regression, a fundamental problem in machine learning, in the setting where an adversary can arbitrarily corrupt a subset of the input data. While the robustness of parametric regression has been extensively studied, its nonparametric counterpart remains largely unexplored. We characterize the adversarial robustness of nonparametric regression, assuming the regression function belongs to the second-order Sobolev space (i.e., the function and its derivatives up to second order are square integrable). The contribution of this paper is two-fold: (i) we establish a minimax lower bound on the estimation error, revealing a fundamental limit that no estimator can overcome, and (ii) we show that, perhaps surprisingly, the classical smoothing spline estimator, when properly regularized, exhibits robustness against adversarial corruption. These results imply that if $o(n)$ out of $n$ samples are corrupted, the estimation error of the smoothing spline vanishes as $n \to \infty$. On the other hand, when a constant fraction of the data is corrupted, no estimator can guarantee vanishing estimation error, implying the optimality of the smoothing spline in terms of the maximum tolerable number of corrupted samples.
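The contrast between sparse and constant-fraction corruption can be illustrated with a small numerical sketch. It is not the paper's estimator or experimental setup: it uses a discrete analogue of the smoothing spline (a second-difference penalty on a uniform grid), and the target function, noise level, value of `lam`, and the "shift a contiguous block of samples" attack are all hypothetical choices made for illustration.

```python
import math
import random


def second_difference_gram(n):
    """Return D^T D as a dense matrix, where D maps a length-n sequence to
    its n - 2 second differences f[i] - 2 f[i+1] + f[i+2]."""
    gram = [[0.0] * n for _ in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for row in range(n - 2):
        for a in range(3):
            for b in range(3):
                gram[row + a][row + b] += stencil[a] * stencil[b]
    return gram


def solve_banded_spd(matrix, rhs, half_bandwidth=2):
    """Solve matrix @ x = rhs by Gaussian elimination without pivoting.
    Assumes a symmetric positive definite matrix whose nonzeros lie within
    half_bandwidth of the diagonal, so only that band needs updating."""
    n = len(rhs)
    m = [row[:] for row in matrix]
    b = list(rhs)
    for col in range(n):
        last = min(col + half_bandwidth, n - 1)
        for r in range(col + 1, last + 1):
            factor = m[r][col] / m[col][col]
            for c in range(col, last + 1):
                m[r][c] -= factor * m[col][c]
            b[r] -= factor * b[col]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        last = min(r + half_bandwidth, n - 1)
        tail = sum(m[r][c] * x[c] for c in range(r + 1, last + 1))
        x[r] = (b[r] - tail) / m[r][r]
    return x


def discrete_smoother(y, lam):
    """Minimize sum_i (y[i] - f[i])^2 + lam * sum_i (second difference of f)^2.
    The minimizer solves the linear system (I + lam * D^T D) f = y."""
    n = len(y)
    system = second_difference_gram(n)
    for i in range(n):
        for j in range(n):
            system[i][j] *= lam
        system[i][i] += 1.0
    return solve_banded_spd(system, y)


def sup_error(n, q, lam, shift=5.0, noise_sd=0.1, seed=0):
    """Largest absolute error of the fit on the grid when q samples in a
    central contiguous block are shifted by `shift` (hypothetical attack)."""
    rng = random.Random(seed)
    grid = [i / (n - 1) for i in range(n)]
    truth = [math.sin(2.0 * math.pi * t) for t in grid]  # illustrative target
    y = [f + rng.gauss(0.0, noise_sd) for f in truth]
    start = (n - q) // 2
    for i in range(start, start + q):
        y[i] += shift
    fit = discrete_smoother(y, lam)
    return max(abs(a - b) for a, b in zip(fit, truth))


if __name__ == "__main__":
    n, lam = 120, 1000.0
    for q in (0, 3, 60):
        print(f"q = {q:3d}  sup error = {sup_error(n, q, lam):.3f}")
```

With only a few corrupted samples, the curvature penalty averages their effect over many neighbours, whereas a block covering half of the samples moves the fit by roughly the full shift. This mirrors, informally, the gap between the $o(n)$ and constant-fraction regimes described above; it is not a substitute for the paper's bounds.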
