Adversarial Robustness of Nonparametric Regression

Parsa Moradi, Hanzaleh Akabrinodehi, Mohammad Ali Maddah-Ali

TL;DR

The paper studies adversarial robustness in nonparametric regression for functions in the second-order Sobolev space $\mathcal{W}^2(\Omega)$ when up to $q$ samples can be corrupted. It analyzes the classical smoothing spline estimator with regularization parameter $\lambda$ and derives an upper bound on the adversarial risk, along with a minimax lower bound that shows no estimator can achieve vanishing error when $q = \Theta(n)$. The results establish that smoothing splines remain robust for $q = o(n)$, achieving minimax-optimal performance in terms of the maximum tolerable corruption, and reveal a fundamental limitation in the linear-corruption regime. Experimental results corroborate the theory under various designs and attack strategies, illustrating substantial robustness and the practical relevance of smoothing splines in adversarial scenarios.

Abstract

In this paper, we investigate the adversarial robustness of nonparametric regression, a fundamental problem in machine learning, under the setting where an adversary can arbitrarily corrupt a subset of the input data. While the robustness of parametric regression has been extensively studied, its nonparametric counterpart remains largely unexplored. We characterize the adversarial robustness in nonparametric regression, assuming the regression function belongs to the second-order Sobolev space (i.e., it is square integrable up to its second derivative). The contribution of this paper is two-fold: (i) we establish a minimax lower bound on the estimation error, revealing a fundamental limit that no estimator can overcome, and (ii) we show that, perhaps surprisingly, the classical smoothing spline estimator, when properly regularized, exhibits robustness against adversarial corruption. These results imply that if $o(n)$ out of $n$ samples are corrupted, the estimation error of the smoothing spline vanishes as $n \to \infty$. On the other hand, when a constant fraction of the data is corrupted, no estimator can guarantee vanishing estimation error, implying the optimality of the smoothing spline in terms of maximum tolerable number of corrupted samples.
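The setting described above can be tried out numerically. The sketch below is not the paper's estimator: it uses a discrete second-difference penalized least-squares smoother (a Whittaker-type analogue of the smoothing spline on an equispaced grid), solved in pure Python. The test function $f(x) = x\sin(x)$ is taken from the figure captions; the noise level, the additive attack, the value of `lam`, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def second_difference_smoother(y, lam):
    """Solve (I + lam * D^T D) z = y, where D takes second differences.

    Discrete analogue of a smoothing spline on an equispaced grid
    (an assumption of this sketch, not the paper's estimator).
    """
    n = len(y)
    # Dense storage of the pentadiagonal matrix A = I + lam * D^T D.
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 1.0
    for k in range(n - 2):
        idx = (k, k + 1, k + 2)
        coef = (1.0, -2.0, 1.0)
        for a, ca in zip(idx, coef):
            for b, cb in zip(idx, coef):
                A[a][b] += lam * ca * cb
    # A is symmetric positive definite, so elimination without pivoting
    # is safe and keeps the bandwidth at 2.
    z = list(y)
    for i in range(n):
        for j in range(i + 1, min(i + 3, n)):
            m = A[j][i] / A[i][i]
            for c in range(i, min(i + 3, n)):
                A[j][c] -= m * A[i][c]
            z[j] -= m * z[i]
    # Back substitution on the banded upper-triangular system.
    for i in range(n - 1, -1, -1):
        s = z[i]
        for c in range(i + 1, min(i + 3, n)):
            s -= A[i][c] * z[c]
        z[i] = s / A[i][i]
    return z


def rmse(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))


def demo(n=200, q=10, lam=50.0, seed=0):
    """Return (error without corruption, error with q corrupted samples)."""
    rng = random.Random(seed)
    x = [i / (n - 1) for i in range(n)]
    f = [xi * math.sin(xi) for xi in x]
    clean = [fi + rng.gauss(0.0, 0.1) for fi in f]   # hypothetical noise level
    corrupted = list(clean)
    for i in rng.sample(range(n), q):
        corrupted[i] += 5.0                          # hypothetical crude attack
    fit_clean = second_difference_smoother(clean, lam)
    fit_corrupted = second_difference_smoother(corrupted, lam)
    return rmse(fit_clean, f), rmse(fit_corrupted, f)


if __name__ == "__main__":
    print(demo())
```

Varying `q` relative to `n` in `demo` gives a rough feel for the regimes discussed in the abstract. The discrete `lam` is not on the same scale as the paper's $\lambda$, so the numbers should not be compared with the stated rates.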

Paper Structure

This paper contains 9 sections, 8 theorems, 94 equations, 6 figures.

Key Result

Theorem 1

Let $f \in \mathcal{W}^2(\Omega)$, and let $\hat{f}^{\,a}_{\mathrm{SS}}$ denote the smoothing spline estimator as defined in the paper. Let $M = \max\{m_1, m_2\}$. Assume that $\lambda \to 0$ as $n \to \infty$ and $\lambda > n^{-2}$. Then, for sufficiently large $n$, the theorem gives upper bounds on both $R_2(f, \hat{f}^{\,a}_{\mathrm{SS}})$ and $R_\infty(f, \hat{f}^{\,a}_{\mathrm{SS}})$.
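For orientation, the classical smoothing spline is usually written as a penalized least-squares problem over $\mathcal{W}^2(\Omega)$. The form below is the standard textbook one. The $1/n$ normalization and the symbol $\tilde{y}_i$ for the possibly corrupted responses are assumptions here and may differ from the paper's own definition.

```latex
\hat{f}_{\mathrm{SS}}
  \;=\; \operatorname*{arg\,min}_{g \in \mathcal{W}^2(\Omega)}
  \; \frac{1}{n} \sum_{i=1}^{n} \bigl( \tilde{y}_i - g(x_i) \bigr)^2
  \;+\; \lambda \int_{\Omega} \bigl( g''(x) \bigr)^2 \, dx
```

A larger $\lambda$ penalizes curvature more strongly, while $\lambda \to 0$ approaches interpolation of the data, which is why the theorem's conditions constrain how fast $\lambda$ may shrink with $n$.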

Figures (6)

  • Figure 1: Rates of convergence of the estimation errors $R_2 (f, \hat{f})$ and $R_\infty (f, \hat{f})$ as $n \to\infty$, for any $f$ in the second-order Sobolev space (for the non-asymptotic analysis, see Theorems 1 and 2). The blue curves represent the rates achieved by the smoothing spline estimator. The red curves denote minimax lower bounds that no estimator can beat. Specifically, for $q = o(n)$, both $R_2$ and $R_\infty$ of the smoothing spline estimator converge to zero as $n \to \infty$. When $q = \Theta(n)$, no estimator can achieve vanishing error, which establishes a fundamental limit on robustness. This result shows that smoothing splines are optimal in terms of the maximum tolerable number of adversarial corruptions (see Corollary 4).
  • Figure 2: Construction of the functions $f_1$ (blue) and $f_2$ (red) used in Theorem 2. Both functions belong to $\mathcal{W}^2([0,1])$, where $f_1(x) = 0$ for all $x$, and $f_2(x)$ differs from $f_1$ only on the interval $[0, r_q]$, with $r_q = q/n$. The function $f_2$ is linear on $[0, r_q - \varepsilon_q]$, where $\varepsilon_q = r_q^2$, and transitions smoothly to zero on $[r_q - \varepsilon_q, r_q]$ via a degree-5 polynomial, ensuring $f_2 \in \mathcal{W}^2([0,1])$. This construction induces a non-zero gap in both the $L_2$ and $L_\infty$ norms, while enabling the adversary to obscure the difference by corrupting only $q$ samples, which makes $f_1$ and $f_2$ statistically indistinguishable. The details of this construction are provided in the appendix, in the proof of Theorem 2.
  • Figure 3: Log-log plots of error convergence rates for the cubic smoothing spline estimator $\hat{f} = \hat{f}^a_{\mathrm{SS}}$ for $f(x) = x\sin(x)$ in the uniform design setting, where the input points converge to a uniform distribution. The top row shows $R_2(f, \hat{f})$ and $R_\infty(f, \hat{f})$ errors for $q = n^{0.3}$, along with the corresponding theoretical upper bounds of $\mathcal{O}(n^{-0.8})$ and $\mathcal{O}(n^{-0.6})$, respectively. The bottom row presents $R_2(f, \hat{f})$ and $R_\infty(f, \hat{f})$ errors for $q = n^{0.6}$, with theoretical upper bounds of $\mathcal{O}(n^{-0.53})$ and $\mathcal{O}(n^{-0.48})$, respectively.
  • Figure 4: Log–log plots showing the convergence behavior of the cubic smoothing spline estimator $\hat{f} = \hat{f}^{\,a}_{\mathrm{SS}}$ when the ground-truth function is the MLP network, under the uniform design setting. The top row corresponds to the case $q = n^{0.3}$, with theoretical convergence rates of $\mathcal{O}(n^{-0.8})$ for $R_2(f, \hat{f})$ and $\mathcal{O}(n^{-0.6})$ for $R_\infty(f, \hat{f})$. The bottom row shows results for a higher corruption level, $q = n^{0.6}$, with respective theoretical upper bounds of $\mathcal{O}(n^{-0.53})$ and $\mathcal{O}(n^{-0.48})$.
  • Figure 5: Log-log plots showing the convergence rate of the cubic smoothing spline estimator $\hat{f} = \hat{f}^a_{\mathrm{SS}}$ for $f(x) = x\sin(x)$ under a Gaussian design. The top row shows results for $q = n^{0.3}$, with theoretical rates of $\mathcal{O}(n^{-0.8})$ for $R_2(f, \hat{f})$ and $\mathcal{O}(n^{-0.6})$ for $R_\infty(f, \hat{f})$. The bottom row corresponds to a higher corruption level, $q = n^{0.6}$, with respective theoretical upper bounds of $\mathcal{O}(n^{-0.53})$ and $\mathcal{O}(n^{-0.48})$.
  • ...and 1 more figure
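The construction in the Figure 2 caption can be sketched in a few lines. The caption fixes $r_q = q/n$, $\varepsilon_q = r_q^2$, a linear piece on $[0, r_q - \varepsilon_q]$, and a degree-5 transition to zero on $[r_q - \varepsilon_q, r_q]$. Everything else below is an assumption: the linear coefficients `c0` and `c1`, the choice of a quintic Hermite polynomial that matches value, first and second derivative at both ends, the equispaced design $x_i = i/n$, and the function names. The paper's exact polynomial is given in its appendix and may differ.

```python
def make_f2(n, q, c0=0.0, c1=1.0):
    """Build a function shaped like f_2 in the Figure 2 caption.

    c0, c1 (the linear piece c0 + c1 * x) are hypothetical; the caption
    only states that f_2 is linear on [0, r_q - eps_q].
    """
    if not 0 < q < n:
        raise ValueError("need 0 < q < n so that 0 < r_q < 1")
    r_q = q / n
    eps_q = r_q ** 2
    a = r_q - eps_q            # left end of the transition interval
    v0 = c0 + c1 * a           # value of the linear piece at a
    d0 = c1 * eps_q            # slope at a, rescaled to t = (x - a) / eps_q

    def f2(x):
        if x <= a:
            return c0 + c1 * x
        if x >= r_q:
            return 0.0
        t = (x - a) / eps_q
        # Quintic Hermite basis: h0 carries the value at t = 0, h1 the slope
        # at t = 0; both have zero value, slope and curvature at t = 1 and
        # zero curvature at t = 0.
        h0 = 1 - 10 * t**3 + 15 * t**4 - 6 * t**5
        h1 = t - 6 * t**3 + 8 * t**4 - 3 * t**5
        return v0 * h0 + d0 * h1

    return f2, r_q, a


def points_in_support(n, q):
    """Count equispaced design points x_i = i / n (assumed) inside [0, r_q]."""
    r_q = q / n
    return sum(1 for i in range(1, n + 1) if i / n <= r_q)


if __name__ == "__main__":
    f2, r_q, a = make_f2(n=1000, q=100)
    print(r_q, a, f2(a), f2(r_q), points_in_support(1000, 100))
```

Under the equispaced assumption, the number of design points where $f_1$ and $f_2$ can differ equals $q$, which is the mechanism the caption describes: corrupting those samples is enough to hide the difference between the two functions.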

Theorems & Definitions (9)

  • Theorem 1: Upper Bound
  • Corollary 1: Convergence Rate of $R_2(f, \hat{f})$
  • Corollary 2: Convergence Rate of $R_\infty(f, \hat{f})$
  • Theorem 2
  • Corollary 3
  • Corollary 4: On optimality of Smoothing Spline
  • Lemma 1
  • Lemma 2
  • proof