The Influence Function of Penalized Regression Estimators

Viktoria Öllerer, Christophe Croux, Andreas Alfons

TL;DR

The paper analyzes the robustness of penalized regression estimators through influence functions, deriving the influence function, asymptotic variance, and MSE for penalized M-estimators and the sparse LTS estimator. It shows that robustness depends on using a loss function with a bounded derivative: the lasso and ridge can have unbounded influence, while sparse LTS and bounded-loss M-estimators bound the influence of leverage points and outliers. Through theoretical results, plots, and simulations, it compares asymptotic efficiency and finite-sample sensitivity, recommending sparse LTS when leverage contamination is likely and the L1-penalized Huber estimator when vertical outliers dominate. The findings provide practical guidance for selecting robust penalties and loss functions in high-dimensional regression with outliers.

Abstract

To perform regression analysis in high dimensions, lasso or ridge estimation is a common choice. However, it has been shown that these methods are not robust to outliers. Therefore, alternatives such as penalized M-estimation or the sparse least trimmed squares (LTS) estimator have been proposed. The robustness of these regression methods can be measured with the influence function, which quantifies the effect of infinitesimal perturbations in the data. Furthermore, it can be used to compute the asymptotic variance and the mean squared error. In this paper we compute the influence function, the asymptotic variance and the mean squared error for penalized M-estimators and the sparse LTS estimator. The asymptotic bias of the estimators makes the calculations nonstandard. We show that only M-estimators with a loss function with a bounded derivative are robust against regression outliers. In particular, the lasso has an unbounded influence function.
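
The role of a bounded derivative can be illustrated with a minimal sketch. It compares the derivative $\psi$ of the squared-error loss with that of the Huber loss as the residual grows. The function names and the tuning constant `k = 1.345` (a conventional default) are assumptions made for illustration and are not taken from the paper. The sketch only shows boundedness of $\psi$ in the residual; it does not compute the full influence function.

```python
def psi_squared(r: float) -> float:
    """Derivative of the squared-error loss rho(r) = r**2 / 2 (unbounded in r)."""
    return r


def psi_huber(r: float, k: float = 1.345) -> float:
    """Derivative of the Huber loss: equal to r for |r| <= k, clipped to +/-k beyond.

    k = 1.345 is a conventional tuning constant, assumed here for illustration.
    """
    return max(-k, min(k, r))


# As the residual grows, psi_squared grows without bound,
# while psi_huber stays at the clipping constant k.
for r in (0.5, 10.0, 1000.0):
    print(f"r={r:8.1f}  psi_squared={psi_squared(r):8.1f}  psi_huber={psi_huber(r):6.3f}")
```

Since the derivative of the loss enters the influence function of an M-estimator as a factor, a residual that can push $\psi$ arbitrarily far is what allows a single vertical outlier to have unbounded effect under squared-error loss.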

Paper Structure

This paper contains 10 sections, 8 theorems, 66 equations, 14 figures.

Key Result

Lemma 3.1

Let $y=x\beta_0 + e$ be a simple regression model as in (eq:ModelF). Let $H_0$ be the joint distribution of $x$ and $y$, with $x$ and $e$ normally distributed. Then the explicit solution of the sparse LTS functional (eq:spLTS) is with $c_1 = \alpha - 2q_\alpha \phi(q_\alpha)$, $q_\alpha$ the $\frac{\alpha+1}{2}$-quantile of the standard normal distribution and $\phi$ its density.
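
The constant $c_1 = \alpha - 2q_\alpha \phi(q_\alpha)$ from the lemma can be evaluated numerically. The following is a minimal sketch using only the Python standard library; the function name `sparse_lts_c1` and the example values of $\alpha$ are assumptions chosen for illustration. It also relies on reading $\alpha$ as a proportion strictly between 0 and 1, which the excerpt above does not state explicitly.

```python
from statistics import NormalDist


def sparse_lts_c1(alpha: float) -> float:
    """Evaluate c_1 = alpha - 2 * q_alpha * phi(q_alpha).

    q_alpha is the (alpha + 1) / 2 quantile of the standard normal
    distribution and phi is the standard normal density.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha is assumed to lie strictly between 0 and 1")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    q_alpha = std_normal.inv_cdf((alpha + 1.0) / 2.0)
    return alpha - 2.0 * q_alpha * std_normal.pdf(q_alpha)


# Hypothetical example values of alpha, chosen only for illustration.
for alpha in (0.5, 0.75, 0.9):
    print(f"alpha={alpha:.2f}  c_1={sparse_lts_c1(alpha):.4f}")
```

As a sanity check, $c_1$ equals the truncated second moment $\int_{-q_\alpha}^{q_\alpha} z^2 \phi(z)\,dz$ of a standard normal variable, so it lies between 0 and $\alpha$ and approaches 1 as $\alpha \to 1$.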

Figures (14)

  • Figure 1: Biweight and Huber loss function $\rho$ and their first derivatives $\psi$.
  • Figure 2: The smoothly clipped absolute deviation (SCAD) penalty function
  • Figure 3: Bias of various functionals for different values of $\beta_0$ ($\lambda = 0.1$ fixed). Note that the small fluctuations are due to Monte Carlo simulations in the computation of the functional.
  • Figure 4: Approximation of $|\beta|$ using $\beta\cdot\tanh(K\beta)$
  • Figure 5: Influence functions for different penalty functions (least squares, ridge, lasso and SCAD) for $\beta_0 = 1.5$ with $(x_0, y_0)\in[-10,10]^2$ and the vertical axis ranging from $-250$ to $100$
  • ...and 9 more figures
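
The approximation named in the caption of Figure 4, $|\beta| \approx \beta\cdot\tanh(K\beta)$, can be checked numerically with a short sketch. The function name `smooth_abs` and the values of $K$ and $\beta$ below are assumptions for illustration. The surrogate is differentiable at zero, which is presumably why it is useful when working with the lasso penalty.

```python
import math


def smooth_abs(beta: float, K: float) -> float:
    """Smooth surrogate for |beta|: beta * tanh(K * beta)."""
    return beta * math.tanh(K * beta)


# The gap |beta| - smooth_abs(beta, K) equals |beta| * (1 - tanh(K * |beta|)),
# so it shrinks as K increases for any fixed beta != 0.
for K in (1.0, 10.0, 100.0):
    gaps = [abs(b) - smooth_abs(b, K) for b in (-1.0, -0.1, 0.1, 1.0)]
    print(f"K={K:6.1f}  largest gap on sample points = {max(gaps):.3e}")
```

The surrogate is an even function of $\beta$, never exceeds $|\beta|$, and is exactly zero at $\beta = 0$.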

Theorems & Definitions (16)

  • Lemma 3.1
  • Proposition 4.1
  • Corollary 4.2
  • Lemma 5.1
  • Lemma 5.2
  • Proposition 5.3
  • Lemma 5.4
  • Lemma 6.1
  • Proof of Equation (eq:lassoFunct)
  • Proof of Lemma (lemma:spLTS_uni)
  • ...and 6 more