
Coordinate Descent Algorithm for Least Absolute Deviations Regression

Zehaan Naik, Debasis Kundu

Abstract

Least Absolute Deviations (LAD) regression provides a robust alternative to ordinary least squares by minimizing the sum of absolute residuals. However, its widespread use has been limited by the computational cost of existing solvers, particularly simplex-based methods in high-dimensional settings. We propose a coordinate descent algorithm for LAD regression that avoids matrix inversion, naturally accommodates the non-differentiability of the objective function, and remains well-defined even when the number of predictors exceeds the number of observations. The key observation is that each coordinate update reduces to a one-dimensional minimization admitting a closed-form solution given by a median or weighted median. The resulting algorithm has per-iteration complexity $O(p\,n \log n)$ and is provably convergent due to the convexity of the LAD objective and the exactness of each coordinate update. Experiments on synthetic and real datasets show that the method matches the accuracy of linear-programming-based LAD solvers while offering improved scalability and stability in high-dimensional regimes, including cases where $p \ge n$. The method is easy to implement, requires no specialized optimization software, and provides a practical tool for robust linear models.
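
The coordinate update described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the names `weighted_median` and `lad_cd`, the zero initialization, the sweep limit, and the stopping tolerance are all assumptions made for the example. Holding the other coefficients fixed, the objective in coordinate $j$ is $\sum_i |x_{ij}|\,|r_i/x_{ij} - b|$ over rows with $x_{ij} \neq 0$, where $r_i$ is the partial residual, so a minimizer is a weighted median of the ratios $r_i/x_{ij}$ with weights $|x_{ij}|$.

```python
import numpy as np


def weighted_median(values, weights):
    """Return a minimizer of sum_i weights[i] * |values[i] - b| over b.

    Assumes all weights are strictly positive.
    """
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w)
    # First sorted position where the cumulative weight reaches half the total.
    k = np.searchsorted(cum, 0.5 * cum[-1])
    return v[k]


def lad_cd(X, y, n_sweeps=100, tol=1e-10):
    """Cyclic coordinate descent for sum_i |y_i - x_i^T beta| (illustrative).

    n_sweeps and tol are hypothetical defaults chosen for this sketch.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta = np.zeros(p)
    resid = y - X @ beta  # full residual, kept up to date incrementally
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            xj = X[:, j]
            mask = xj != 0  # rows with x_ij = 0 do not depend on beta_j
            if not mask.any():
                continue
            # Partial residual: add coordinate j's contribution back in.
            partial = resid[mask] + xj[mask] * beta[j]
            new_bj = weighted_median(partial / xj[mask], np.abs(xj[mask]))
            # Update the full residual for the change in beta_j.
            resid += xj * (beta[j] - new_bj)
            max_change = max(max_change, abs(new_bj - beta[j]))
            beta[j] = new_bj
        if max_change < tol:
            break
    return beta
```

Each coordinate update sorts at most $n$ ratios, so one sweep over all $p$ coordinates costs $O(p\,n \log n)$, in line with the complexity stated in the abstract. As a sanity check, with a single column of ones the update is an ordinary median: `lad_cd(np.ones((3, 1)), [1.0, 2.0, 100.0])` returns `array([2.])`.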

Paper Structure (24 sections, 3 theorems, 22 equations, 7 figures, 7 tables, 2 algorithms)

Key Result

Lemma 1

Let $L(\beta) = \sum_{i=1}^{n} \lvert y_i - x_i^\top \beta \rvert$. Then $L$ is convex in $\beta$.
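
One standard way to see the convexity, given here as a sketch and not necessarily the argument used in the paper, takes $L(\beta)$ to be the sum of absolute residuals from the abstract, with $y_i$ the response and $x_i$ the predictor vector of observation $i$ (notation assumed for this illustration). For any $\beta, \beta'$ and $\lambda \in [0, 1]$, the triangle inequality applied term by term gives

$$
\begin{aligned}
L\big(\lambda\beta + (1-\lambda)\beta'\big)
&= \sum_{i=1}^{n} \big\lvert \lambda\,(y_i - x_i^\top \beta) + (1-\lambda)\,(y_i - x_i^\top \beta') \big\rvert \\
&\le \lambda \sum_{i=1}^{n} \lvert y_i - x_i^\top \beta \rvert + (1-\lambda) \sum_{i=1}^{n} \lvert y_i - x_i^\top \beta' \rvert \\
&= \lambda L(\beta) + (1-\lambda) L(\beta').
\end{aligned}
$$

The first equality uses $y_i = \lambda y_i + (1-\lambda) y_i$. Restricting $\beta$ to vary in a single coordinate preserves the inequality, which is the content of the corollary on coordinate subproblems listed below.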

Figures (7)

  • Figure 1: Performance of the LAD coordinate descent algorithm on synthetic data. Left: fitted regression line obtained by LAD-CD. Right: evolution of parameter MAE and prediction MAE over iterations.
  • Figure 2: Distribution of final prediction MAE across 1000 independent replications, illustrating the convergence stability of the optimized LAD-CD algorithm.
  • Figure 3: Fitted regression lines for LAD-CD (red), OLS (green dashed), and QuantReg (purple dotted) on the outlier-contaminated dataset (left), and convergence of parameter and prediction MAE for LAD-CD across iterations (right).
  • Figure 4: Warm-start results on outlier-contaminated data. Left: initial Ridge and GA fits. Right: LAD-CD refinement substantially reduces bias and moves the fitted line closer to the true signal.
  • Figure 5: Left: predicted vs. true median house values for the Boston Housing dataset. Both LAD-CD (blue) and quantreg (green) produce similar fits, illustrating the consistency of our method. Right: MAE loss over iterations for LAD-CD compared to quantreg. The coordinate descent method converges monotonically toward the optimal loss achieved by the LP-based solver.
  • ...and 2 more figures

Theorems & Definitions (5)

  • Lemma 1: Convexity of the LAD objective
  • Corollary 1: Convexity of coordinate subproblems
  • Theorem 1: Global convergence of LAD-CD
  • Proof: Proof of Lemma 1
  • Proof: Proof of Theorem 1