Improved $\ell_{p}$ Regression via Iteratively Reweighted Least Squares

Alina Ene, Ta Duy Nguyen, Adrian Vladu

TL;DR

This paper develops fast, accurate algorithms for $\ell_p$ regression based on iteratively reweighted least squares (IRLS) within a primal-dual framework. It introduces two IRLS-based algorithms: a low-precision scheme whose cost scales as $\mathrm{poly}(1/\epsilon)$ and a high-precision scheme whose cost scales as $\log(1/\epsilon)$, achieving state-of-the-art iteration bounds with a simpler, lightweight solver. The methods rely on a dual energy $\mathcal{E}(r)$ and an invariant-based coordinate update that drives dual progress and allows a primal solution to be recovered; high accuracy is obtained by iterative refinement through a ResidualSolver. Experiments on synthetic and real-world data show substantial improvements over prior IRLS and CVX solvers, with better scalability and faster convergence for $\ell_p$ regression.

Abstract

We introduce fast algorithms for solving $\ell_{p}$ regression problems using the iteratively reweighted least squares (IRLS) method. Our approach achieves state-of-the-art iteration complexity, outperforming the IRLS algorithm by Adil-Peng-Sachdeva (NeurIPS 2019) and matching the theoretical bounds established by the complex algorithm of Adil-Kyng-Peng-Sachdeva (SODA 2019, J. ACM 2024) via a simpler lightweight iterative scheme. This bridges the existing gap between theoretical and practical algorithms for $\ell_{p}$ regression. Our algorithms depart from prior approaches, using a primal-dual framework, in which the update rule can be naturally derived from an invariant maintained for the dual objective. Empirically, we show that our algorithms significantly outperform both the IRLS algorithm by Adil-Peng-Sachdeva and MATLAB/CVX implementations.
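
For readers unfamiliar with IRLS, the following is a minimal sketch of the classical reweighting idea for $\min_{Ax=b}\|x\|_{p}$ with $p\geq 2$. It is not the primal-dual algorithm of this paper: it is the textbook scheme in which each iteration solves a weighted least squares problem with weights $|x_i|^{p-2}$. The function name `irls_lp_min_norm`, the weight floor `delta`, and the backtracking damping are assumptions introduced here for illustration only.

```python
import numpy as np

def irls_lp_min_norm(A, b, p=4.0, iters=50, delta=1e-6):
    """Textbook damped IRLS sketch for min ||x||_p subject to Ax = b, p >= 2.

    Illustrative only; assumes A has full row rank.
    """
    # Start from the minimum l2-norm feasible point.
    x = A.T @ np.linalg.solve(A @ A.T, b)
    for _ in range(iters):
        # Reweighting: w_i = |x_i|^(p-2), floored so the weights stay positive.
        w = np.maximum(np.abs(x), delta) ** (p - 2)
        # Weighted least squares: y = argmin sum_i w_i y_i^2 s.t. Ay = b,
        # i.e. y = W^{-1} A^T (A W^{-1} A^T)^{-1} b.
        A_winv = A / w  # scales column j of A by 1 / w_j
        y = A_winv.T @ np.linalg.solve(A_winv @ A.T, b)
        # Backtracking: move toward y only while the p-norm strictly decreases.
        base = np.linalg.norm(x, p)
        step = 1.0
        while step > 1e-8:
            cand = x + step * (y - x)  # stays feasible: both x and y satisfy Ax = b
            if np.linalg.norm(cand, p) < base:
                x = cand
                break
            step *= 0.5
        else:
            break  # no improving step found; stop
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 20))
    b = rng.standard_normal(5)
    x = irls_lp_min_norm(A, b, p=4.0)
    print("residual:", np.linalg.norm(A @ x - b))
    print("l4 norm:", np.linalg.norm(x, 4))
```

The backtracking step is a simple safeguard chosen for this sketch, since undamped reweighting is not guaranteed to decrease the objective for larger $p$. It carries none of the iteration bounds discussed in the abstract.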

Paper Structure

This paper contains 28 sections, 28 theorems, 85 equations, 5 figures, 1 table, 4 algorithms.

Key Result

Theorem 1.1

For any $p\geq 2$, there is an iterative algorithm for the $\ell_{p}$ regression problem $\min_{x\in\mathbb{R}^{n}\colon Ax=b}\|x\|_{p}$ that solves $O\left(\log\log n+\log\left(1/\epsilon\right)\right)$ subproblems, each of which makes $O(((\frac{1}{\epsilon})^{\frac{2p-3}{p-2}}+n^{\frac{p-

Figures (5)

  • Figure 1: Performance on random matrices: $\min\left\Vert Ax-b\right\Vert _{p}^{p}$ with $\epsilon=10^{-10}$. We compare our algorithm with CVX using the SDPT3 and SeDuMi solvers and with $p$-IRLS by Adil-Peng-Sachdeva (NeurIPS 2019). Panels (a),(b),(e),(f) plot the average and standard deviation of the number of iterations and the time taken by the solvers to find a solution over 10 runs. Panels (c),(d),(g),(h) report the same statistics over 5 runs.
  • Figure 2: Performance on random graph instances: $\min\left\Vert Ax-b\right\Vert _{p}^{p}$ with $\epsilon=10^{-10}$. We compare our algorithm with CVX using the SDPT3 and SeDuMi solvers and with $p$-IRLS by Adil-Peng-Sachdeva (NeurIPS 2019). Panels (a),(b),(e),(f) report results over 10 runs. Panels (c),(d),(g),(h) report results over 5 runs.
  • Figure 3: Error of the solution against CVX/SDPT3 solution in log10 scale.
  • Figure 4: Performance when varying $\epsilon$ on random matrix and random graph instances.
  • Figure 5: Performance when $p=1.1$ and $p=1.9$ on random matrices of size $n\times(n-100)$.

Theorems & Definitions (54)

  • Remark 1.1
  • Theorem 1.1
  • Remark 1.2
  • Theorem 1.2
  • Remark 4.1
  • Definition A.1
  • Lemma A.1
  • Lemma A.2
  • proof
  • Lemma B.1
  • ...and 44 more