Alternating Iteratively Reweighted $\ell_1$ and Subspace Newton Algorithms for Nonconvex Sparse Optimization

Hao Wang, Xiangyu Yang, Yichen Zhu

TL;DR

The paper tackles nonconvex sparse optimization by formulating $\min_x f(x)+\lambda h(x)$ with $h(x)=\sum_i (r\circ|\cdot|)(x_i)$ and proposing IReNA, a hybrid algorithm that alternates subspace iteratively reweighted $\ell_1$ steps with subspace Newton updates. It introduces two optimality residuals to identify and exploit a relevant subspace, guaranteeing closed-form subproblem solutions via soft-thresholding and accelerating convergence through Newton steps on the active set. The authors prove global convergence to a critical point, establish local convergence rates under the KL property, and show quadratic convergence when exact Newton steps are used; they also discuss a trust-region variant with a similar local complexity bound. Numerical experiments on logistic regression with various nonconvex regularizers and real datasets demonstrate improved efficiency and high-quality sparse solutions compared to state-of-the-art hybrids. Overall, IReNA offers a scalable, theoretically sound framework for a broad class of nonconvex sparse regularizers with strong practical performance.
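For orientation, the problem and a generic iteratively reweighted $\ell_1$ step can be written out as follows. This is a sketch of the standard construction, not the paper's exact subproblem, which is restricted to a subspace and is not reproduced in this summary. The concavity and differentiability of $r$ on $(0,\infty)$, the weights $w_i^k$, and the proximal parameter $\beta > 0$ are assumptions introduced here for illustration.

```latex
% Problem from the TL;DR:
\min_{x \in \mathbb{R}^n} \; F(x) := f(x) + \lambda \sum_{i=1}^{n} r(|x_i|).

% Assumed: r is concave and differentiable on (0, \infty), and \epsilon_i^k > 0.
% Linearizing r at |x_i^k| + \epsilon_i^k gives the weights
w_i^k = r'\bigl(|x_i^k| + \epsilon_i^k\bigr),
\qquad
r\bigl(|x_i| + \epsilon_i^k\bigr) \le r\bigl(|x_i^k| + \epsilon_i^k\bigr)
  + w_i^k \bigl(|x_i| - |x_i^k|\bigr).

% Assumed proximal model with parameter \beta > 0:
x^{k+1} = \arg\min_{x} \; \nabla f(x^k)^{\top}(x - x^k)
  + \frac{\beta}{2}\,\|x - x^k\|_2^2
  + \lambda \sum_{i=1}^{n} w_i^k\,|x_i|.

% Closed form via soft-thresholding, with u^k = x^k - \nabla f(x^k)/\beta:
x_i^{k+1} = \operatorname{sign}(u_i^k)\,
  \max\!\Bigl(|u_i^k| - \frac{\lambda w_i^k}{\beta},\; 0\Bigr).
```

Because the weighted $\ell_1$ model separates across coordinates, each component is obtained independently, which is what makes the closed form possible.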

Abstract

This paper presents a novel hybrid algorithm for minimizing the sum of a continuously differentiable loss function and a nonsmooth, possibly nonconvex, sparse regularization function. The proposed method alternates between solving a reweighted $\ell_1$-regularized subproblem and performing an inexact subspace Newton step. The reweighted $\ell_1$-subproblem allows for efficient closed-form solutions via the soft-thresholding operator, avoiding the computational overhead of proximity operator calculations. As the algorithm approaches an optimal solution, it maintains a stable support set, ensuring that nonzero components stay uniformly bounded away from zero. It then switches to a perturbed regularized Newton method, further accelerating the convergence. We prove global convergence to a critical point and, under suitable conditions, demonstrate that the algorithm exhibits local linear and quadratic convergence rates. Numerical experiments show that our algorithm outperforms existing methods in both efficiency and solution quality across various model prediction problems.
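The closed-form soft-thresholding solution mentioned in the abstract can be illustrated with a minimal sketch. It assumes an $\ell_p$ regularizer $r(t) = t^p$ with $0 < p < 1$ and a proximal parameter `beta`; the function names, the weight formula, and `beta` are hypothetical choices for illustration and are not taken from the paper, whose subproblem is additionally restricted to a subspace.

```python
def soft_threshold(v, tau):
    """Soft-thresholding operator: sign(v) * max(|v| - tau, 0)."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def lp_weights(x, eps, p):
    """Weights w_i = p * (|x_i| + eps_i)^(p - 1).

    Assumption: the regularizer is r(t) = t^p with 0 < p < 1, and each
    eps_i > 0 keeps the weight finite when x_i = 0.
    """
    return [p * (abs(xi) + ei) ** (p - 1.0) for xi, ei in zip(x, eps)]


def reweighted_l1_step(x, grad, eps, lam, p, beta):
    """One proximal step on the weighted l1 model (illustrative only).

    Minimizes grad^T (z - x) + (beta / 2) * ||z - x||^2
              + lam * sum_i w_i * |z_i|
    coordinate by coordinate via soft-thresholding.
    """
    w = lp_weights(x, eps, p)
    return [
        soft_threshold(xi - gi / beta, lam * wi / beta)
        for xi, gi, wi in zip(x, grad, w)
    ]


# Usage with made-up numbers: the second coordinate is thresholded to zero.
z = reweighted_l1_step(
    x=[3.0, 0.0], grad=[1.0, 0.2], eps=[1.0, 1.0], lam=1.0, p=0.5, beta=1.0
)
```

Coordinates near zero receive larger weights, so they are thresholded more aggressively. Each coordinate costs only a comparison and a subtraction, with no inner solver for the proximity operator of the nonconvex regularizer.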

Paper Structure (24 sections, 16 theorems, 72 equations, 3 figures, 4 tables, 3 algorithms)

This paper contains 24 sections, 16 theorems, 72 equations, 3 figures, 4 tables, 3 algorithms.

Key Result

Lemma

Consider eq.relaxedlp and eq.Gk. For any $(x^k, \epsilon^k), (x^{k+1}, \epsilon^{k+1}) \in \mathbb{R}^n \times \mathbb{R}_{++}^n$, it holds for any $\epsilon^{k+1} \le \epsilon^k$ that Moreover, the following statements are equivalent:

Figures (3)

  • Figure 1: Illustration of the local quadratic convergence of IReNA.
  • Figure 2: Comparison with HpgSRN for $p = 0.3$ on synthetic datasets. The $x$-axis represents the feature size, while the $y$-axis shows the ratio of CPU time for the same problem with $p = 0.3$ to that with $p = 0.5$.
  • Figure 3: Convergence behavior for other nonconvex regularizers on real-world datasets.

Theorems & Definitions (32)

  • Definition: Stationary point (wang2021nonconvex)
  • Lemma
  • Proof
  • Proposition
  • Proof
  • Lemma
  • Proof
  • Lemma
  • Lemma
  • Proof
  • ...and 22 more