A linesearch-type normal map-based semismooth Newton method for nonsmooth nonconvex composite optimization

Hanfeng Zeng, Wenqing Ouyang, Andre Milzarek

TL;DR

This paper develops a linesearch-type normal map-based semismooth Newton method for nonsmooth, nonconvex composite optimization problems of the form $\psi(x)=f(x)+\varphi(x)$, where $f$ is smooth and $\varphi$ is convex and lower semicontinuous (lsc). An adaptive estimation strategy avoids explicit computation of the Lipschitz constant $L$, and the analysis establishes global convergence, convergence of the iterates under the Kurdyka-Łojasiewicz property, and local q-superlinear convergence. Each iteration inexactly solves a symmetric linear system derived from the normal map $F^{\lambda}_{\mathrm{nor}}(z)$ with a CG method; a linesearch enforces descent of a merit function $H(\tau,z)$ and allows the transition to fast local convergence. Numerical experiments on sparse logistic regression, nonlinear image compression, and group-sparse nonlinear least squares demonstrate competitive performance and robustness.
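
To make the normal map concrete, the following minimal sketch evaluates it for the special case $\varphi = \mu\|\cdot\|_1$, assuming the standard definition $F^{\lambda}_{\mathrm{nor}}(z) = \nabla f(\mathrm{prox}_{\lambda\varphi}(z)) + (z - \mathrm{prox}_{\lambda\varphi}(z))/\lambda$. The toy objective, the data `b`, and the helper names `prox_l1` and `normal_map` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def prox_l1(z, t):
    """Soft-thresholding: proximity operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def normal_map(z, grad_f, lam, mu):
    """Normal map for phi = mu * ||.||_1 (assumed standard definition).

    Returns F = grad f(x) + (z - x) / lam together with x = prox_{lam*phi}(z).
    """
    x = prox_l1(z, lam * mu)
    return grad_f(x) + (z - x) / lam, x

# Hypothetical toy instance: f(x) = 0.5 * ||x - b||^2, so grad f(x) = x - b.
b = np.array([3.0, -0.2, 0.5, -2.0])
mu, lam = 1.0, 0.7
grad_f = lambda x: x - b

# For this particular f, the minimizer of f + mu * ||.||_1 is soft-thresholding of b.
x_star = prox_l1(b, mu)
# A stationary point x_star corresponds to z_star = x_star - lam * grad f(x_star).
z_star = x_star - lam * grad_f(x_star)

F_star, x_back = normal_map(z_star, grad_f, lam, mu)          # vanishes at z_star
F_other, _ = normal_map(np.zeros_like(b), grad_f, lam, mu)    # nonzero elsewhere
```

In this sketch the normal map vanishes at `z_star` and `prox_l1` maps `z_star` back to `x_star`, which illustrates why a Newton-type method can be applied to the equation $F^{\lambda}_{\mathrm{nor}}(z) = 0$ in the variable $z$.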

Abstract

We propose a novel linesearch variant of the trust region normal map-based semismooth Newton method developed in [Ouyang and Milzarek, Math. Program. 212(1-2), 389--435 (2025)] for solving a class of nonsmooth, nonconvex composite-type optimization problems. Our approach uses adaptive parameter estimation techniques, which allow us to avoid explicit and potentially expensive Lipschitz constant computations. We provide extensive convergence results including global convergence, convergence of the iterates under the Kurdyka-Łojasiewicz inequality, and transition to fast local q-superlinear convergence. Compared to the original trust region framework, the linesearch-based algorithm is simpler and the overall convergence analysis can be conducted under weaker assumptions -- in particular, without requiring explicit boundedness conditions on the Hessian approximations and iterates. Numerical experiments on sparse logistic regression, image compression, and nonlinear least squares with group penalty terms demonstrate the efficiency of the proposed approach.
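
The paper's own parameter estimation rule is not reproduced in this summary. As a rough illustration of how a Lipschitz constant can be estimated adaptively rather than computed, the sketch below uses the classical backtracking test from proximal gradient methods: an estimate $L$ is increased until the descent-lemma bound holds for a gradient step. The function name `estimate_lipschitz`, its parameters, and the quadratic test problem are hypothetical.

```python
import numpy as np

def estimate_lipschitz(f, grad_f, x, L0=1.0, growth=2.0, max_increases=60):
    """Generic backtracking estimate (not the paper's rule).

    Increases L until the descent-lemma bound holds for the gradient step x - g / L.
    """
    fx, g = f(x), grad_f(x)
    L = L0
    for _ in range(max_increases):
        x_new = x - g / L
        # Descent lemma with step size 1/L: f(x_new) <= f(x) - ||g||^2 / (2 L).
        if f(x_new) <= fx - np.dot(g, g) / (2.0 * L) + 1e-12:
            return L
        L *= growth
    return L

# Hypothetical quadratic f(x) = 0.5 * x^T A x; its gradient has Lipschitz constant 8.
A = np.diag([8.0, 1.0])
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x

L_est = estimate_lipschitz(f, grad_f, np.array([1.0, 1.0]), L0=0.5)
```

Such a test only needs function and gradient evaluations, so no global constant has to be known in advance; the returned value reflects the curvature along the tested step and may be smaller than the global Lipschitz constant.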

Paper Structure

This paper contains 19 sections, 11 theorems, 75 equations, 8 figures, 2 tables, and 3 algorithms.

Key Result

Lemma 2.1

Let $B, D \in \mathbb{R}^{n \times n}$ be symmetric and let $M = BD+(I-D)/\lambda$. Suppose that the CG method [Dembo and Steihaug, 1983] is run with tolerance parameter $\epsilon \geq 0$ to solve $D M \tilde{q}= -DF^{\lambda}_{\mathrm{nor}}(z)$, and define $m = \dim \mathrm{range}(D)$. Then

Figures (8)

  • Figure 1: Change of the relative error $\texttt{rel\_err}$ with respect to the CPU time for solving the $\ell_1$-logistic regression problem \eqref{eq:logreg-prob}.
  • Figure 2: Change of the relative error $\texttt{rel\_err}$ with respect to the number of iterations for solving the $\ell_1$-logistic regression problem \eqref{eq:logreg-prob}. The last four iterations of TRSSN-H and LSSSN-H are marked with $\circ$ to illustrate local superlinear convergence.
  • Figure 3: Comparison of LSSSN and LSSSN-H on CINA and rcv1 for different choices of $\epsilon_k = \min\{\chi(z_k)^{a},b\}$, where $(a,b) \in\{(1.5,0.1),(2,0.05),(2.5,0.01),(3,0.001)\}$.
  • Figure 4: Numerical comparison of iPiano, SpaRSA, TRSSN, and LSSSN on problem \eqref{eq:diff}. Plot of the norm of the natural residual with respect to the CPU time for different images.
  • Figure 5: Numerical comparison of iPiano, SpaRSA, TRSSN, and LSSSN on problem \eqref{eq:diff}. Plot of the norm of the natural residual with respect to the required CPU time for different $\mu$. The test images in subfigures (a)-(c) and (d)-(f) are football and city, respectively.
  • ...and 3 more figures

Theorems & Definitions (24)

  • Lemma 2.1
  • Lemma 2.2
  • proof
  • Lemma 3.1
  • proof
  • Theorem 3.3
  • proof
  • Definition 3.4
  • Lemma 3.6
  • proof
  • ...and 14 more