The Influence Function of Penalized Regression Estimators
Viktoria Öllerer, Christophe Croux, Andreas Alfons
TL;DR
The paper analyzes robustness of penalized regression estimators through influence functions, deriving the influence function, asymptotic variance, and MSE for penalized M-estimators and the sparse LTS. It shows that robustness depends on using a loss function with bounded derivative; the lasso and ridge can possess unbounded influence, while sparse LTS and bounded-loss M-estimators bound the influence from leverage points and outliers. Through theoretical results, plots, and simulations, it compares asymptotic efficiency and finite-sample sensitivity, recommending sparse LTS when leverage contamination is likely and L1-penalized Huber when vertical outliers dominate. The findings provide practical guidance for selecting robust penalties and loss functions in high-dimensional regression contexts with outliers.
Abstract
To perform regression analysis in high dimensions, lasso or ridge estimation is a common choice. However, it has been shown that these methods are not robust to outliers. Therefore, alternatives such as penalized M-estimation or the sparse least trimmed squares (LTS) estimator have been proposed. The robustness of these regression methods can be measured with the influence function, which quantifies the effect of infinitesimal perturbations in the data. Furthermore, it can be used to compute the asymptotic variance and the mean squared error. In this paper we compute the influence function, the asymptotic variance, and the mean squared error for penalized M-estimators and the sparse LTS estimator. The asymptotic bias of the estimators makes the calculations nonstandard. We show that only M-estimators with a loss function with a bounded derivative are robust against regression outliers. In particular, the lasso has an unbounded influence function.
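The bounded-derivative condition stated above can be illustrated with a minimal sketch that is not taken from the paper. It compares the derivative of the squared loss, which the lasso and ridge use, with the derivative of the Huber loss as the residual grows. The function names `psi_squared` and `psi_huber` and the tuning constant `k = 1.345` (a commonly used default for the Huber loss) are assumptions made for illustration only.

```python
import numpy as np

def psi_squared(r):
    """Derivative of the squared loss rho(r) = r**2 / 2; it equals r and is unbounded."""
    return np.asarray(r, dtype=float)

def psi_huber(r, k=1.345):
    """Derivative of the Huber loss; it equals r clipped to [-k, k], hence bounded.

    k = 1.345 is an assumed, commonly used tuning constant.
    """
    return np.clip(np.asarray(r, dtype=float), -k, k)

# Residuals of increasing size, mimicking ever more extreme vertical outliers.
residuals = np.array([0.5, 5.0, 50.0, 500.0])
sq = psi_squared(residuals)
hub = psi_huber(residuals)

for r, a, b in zip(residuals, sq, hub):
    print(f"residual={r:7.1f}  squared-loss psi={a:7.1f}  Huber psi={b:6.3f}")
```

The squared-loss derivative keeps growing with the residual, whereas the Huber derivative levels off at `k`. This sketch only looks at the loss derivative in the residual; it does not compute the influence functions derived in the paper, which also involve the penalty and the covariates.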
