An Efficient Dual ADMM for Huber Regression with Fused Lasso Penalty

Mengjiao Shi, Yunhai Xiao

TL;DR

The paper tackles robust linear regression under heavy-tailed errors by combining the adaptive Huber loss with a fused-lasso penalty, so that estimation and variable selection happen simultaneously. The dual formulation is solved by semi-proximal ADMM (spADMM), which yields closed-form updates and a lower computational burden than prior ADMM approaches. The authors establish convergence under standard conditions and demonstrate the method's robustness and scalability on simulated and real data, where it runs faster than a competing FHADMM method, particularly in high-dimensional settings. The results suggest strong practical potential for robust, sparse regression in settings with outliers and sequentially related features.

Abstract

The ordinary least squares estimate in linear regression is sensitive to the influence of errors with large variance, which reduces its robustness, especially when dealing with heavy-tailed errors or outliers frequently encountered in real-world scenarios. To address this issue and accommodate the sparsity of coefficients along with their sequential disparities, we combine the adaptive robust Huber loss function with a fused lasso penalty. This combination yields a robust estimator capable of simultaneously achieving estimation and variable selection. Furthermore, we utilize an efficient alternating direction method of multipliers to solve this regression model from a dual perspective. The effectiveness and efficiency of our proposed approach are demonstrated through numerical experiments carried out on both simulated and real datasets.
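
To make the model in the abstract concrete, the following is a minimal Python sketch of an objective that combines a Huber loss with a fused lasso penalty. It is an illustration under stated assumptions, not the authors' implementation: the function names, the parameter names `tau`, `lam1`, `lam2`, and the choice to average the loss over samples are all hypothetical, and the paper's exact scaling may differ. The sketch only evaluates the objective; it does not implement the dual spADMM solver or the adaptive choice of the Huber threshold.

```python
import numpy as np


def huber_loss(r, tau):
    """Elementwise Huber loss: quadratic for |r| <= tau, linear beyond."""
    a = np.abs(r)
    return np.where(a <= tau, 0.5 * r**2, tau * a - 0.5 * tau**2)


def fused_lasso_penalty(beta, lam1, lam2):
    """lam1 * ||beta||_1 plus lam2 * sum of |beta[j+1] - beta[j]|."""
    sparsity = lam1 * np.sum(np.abs(beta))
    smoothness = lam2 * np.sum(np.abs(np.diff(beta)))
    return sparsity + smoothness


def objective(beta, X, y, tau, lam1, lam2):
    """Mean Huber loss of the residuals plus the fused lasso penalty.

    Averaging over samples is an assumption made for this sketch.
    """
    residuals = y - X @ beta
    return np.mean(huber_loss(residuals, tau)) + fused_lasso_penalty(beta, lam1, lam2)
```

The first penalty term encourages sparse coefficients, while the second penalizes differences between consecutive coefficients, which is what suits the model to sequentially related features. Large residuals contribute only linearly to the loss, which limits the influence of outliers.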

Paper Structure (9 sections, 1 theorem, 23 equations, 2 figures, 2 tables)

This paper contains 9 sections, 1 theorem, 23 equations, 2 figures, 2 tables.

Key Result

Theorem 3.1

(fazel2013hankel) Suppose that the sequence $\{(u^{k}, v^{k}; w^{k})\}$ is generated by spADMM from an initial point $(u^0, v^0; w^0)$. If $\rho\in(0, (1+\sqrt{5})/2)$ and $\eta>0$ is chosen such that $\mathcal{S}$ is positive semi-definite, then the sequence $\{(u^{k}, v^{k})\}$ converges to an

Figures (2)

  • Figure 1: The performance of Algorithm 1 under different parameters with various dimensions and noise types, where "NT" represents the type of noise with values ranging from 1 to 4 corresponding to the specific noise introduced earlier in (i) to (iv).
  • Figure 2: Estimation performance of Algorithm 1 under different noise and sample size.

Theorems & Definitions (3)

  • Remark 3.1
  • Remark 3.2
  • Theorem 3.1