Majorization-Minimization-Based Levenberg--Marquardt Method for Constrained Nonlinear Least Squares

Naoki Marumo, Takayuki Okuno, Akiko Takeda

Abstract

A new Levenberg--Marquardt (LM) method for solving nonlinear least squares problems with convex constraints is described. Various versions of the LM method have been proposed, their main differences being in the choice of a damping parameter. In this paper, we propose a new rule for updating the parameter so as to achieve both global and local convergence even under the presence of a convex constraint set. The key to our results is a new perspective of the LM method from majorization-minimization methods. Specifically, we show that if the damping parameter is set in a specific way, the objective function of the standard subproblem in LM methods becomes an upper bound on the original objective function under certain standard assumptions. Our method solves a sequence of the subproblems approximately using an (accelerated) projected gradient method. It finds an $ε$-stationary point after $O(ε^{-2})$ computation and achieves local quadratic convergence for zero-residual problems under a local error bound condition. Numerical results on compressed sensing and matrix factorization show that our method converges faster in many cases than existing methods.
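
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the toy residual, the box constraint, the damping rule `lam = M * ||F(x_k)||` with `M` doubled on rejection, and the fixed number of inner projected gradient steps are all assumptions made here for the example. Only the overall shape comes from the abstract: solve the standard LM subproblem approximately with a projected gradient method, and accept the step when the subproblem objective upper-bounds the true objective.

```python
import numpy as np

def residual(x):
    # Hypothetical toy residual F: R^2 -> R^2 with a zero at (1, 2).
    return np.array([x[0] ** 2 - 1.0, x[0] * x[1] - 2.0])

def jacobian(x):
    # Jacobian of the toy residual above.
    return np.array([[2.0 * x[0], 0.0], [x[1], x[0]]])

def project(x, lo=0.0, hi=3.0):
    # Euclidean projection onto the box [lo, hi]^d (a closed convex set).
    return np.clip(x, lo, hi)

def objective(x):
    r = residual(x)
    return 0.5 * r @ r

def model(x, xk, Fk, Jk, lam):
    # Standard LM subproblem objective: linearized residual plus damping.
    lin = Fk + Jk @ (x - xk)
    return 0.5 * lin @ lin + 0.5 * lam * (x - xk) @ (x - xk)

def solve_subproblem(xk, Fk, Jk, lam, iters=50):
    # Plain (non-accelerated) projected gradient, step 1 / (||Jk||_2^2 + lam).
    step = 1.0 / (np.linalg.norm(Jk, 2) ** 2 + lam)
    x = xk.copy()
    for _ in range(iters):
        grad = Jk.T @ (Fk + Jk @ (x - xk)) + lam * (x - xk)
        x = project(x - step * grad)
    return x

def mm_lm(x0, M=1.0, outer=100, tol=1e-10):
    x = project(np.asarray(x0, dtype=float))
    for _ in range(outer):
        Fk, Jk = residual(x), jacobian(x)
        if 0.5 * Fk @ Fk < tol:
            break
        while True:
            lam = M * np.linalg.norm(Fk)  # assumed rule: damping tied to residual norm
            x_new = solve_subproblem(x, Fk, Jk, lam)
            # Accept only if the model upper-bounds the true objective here.
            if objective(x_new) <= model(x_new, x, Fk, Jk, lam):
                break
            M *= 2.0  # otherwise increase the damping constant and retry
        x = x_new
    return x

x_star = mm_lm([2.5, 0.5])
```

Because every accepted step satisfies `objective(x_new) <= model(x_new) <= model(x_k) = objective(x_k)`, the objective never increases, which is the majorization-minimization reading of the LM step.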

Paper Structure

This paper contains 43 sections, 16 theorems, 97 equations, 5 figures, 5 tables, 3 algorithms.

Key Result

Lemma 1

Let $\mathcal{X} \subseteq \mathbb R^d$ be any closed convex set, and suppose $x_k \in \mathcal{X}$. Moreover, assume that for some constant $L > 0$, [...]. Then for any $\lambda > 0$ and $x \in \mathcal{X}$ such that [...], the following bound holds: [...]

Figures (5)

  • Figure 1: Minimization of the Rosenbrock function [Rosenbrock, 1960], $f(x, y) = (x - 1)^2 + 100 (y - x^2)^2$. Both gradient descent (GD) and our LM method start from $(-1, 1)$ and converge to the optimal solution, $(1, 1)$. Each marker corresponds to one iteration; GD and LM are truncated after 1000 and 20 iterations, respectively.
  • Figure 2: Results of compressed sensing (problem \ref{eq:problem_cs}).
  • Figure 3: Results of NMF (problem \ref{eq:problem_nmf}).
  • Figure 4: Results of NMF (problem \ref{eq:problem_nmf}) by the TRF method.
  • Figure 5: Results of autoencoder with MNIST (problem \ref{eq:problem_autoencoder}).
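
The Rosenbrock function in Figure 1 fits the least squares setting because it is a sum of two squares. One way to write this is shown below; the residual map $F$ and Jacobian $J$ are notation assumed here for illustration, and the paper's own scaling (for example a factor of $1/2$ in the objective) may differ.

```latex
f(x, y) = \|F(x, y)\|^2, \qquad
F(x, y) = \begin{pmatrix} x - 1 \\ 10\,(y - x^2) \end{pmatrix}, \qquad
J(x, y) = \begin{pmatrix} 1 & 0 \\ -20x & 10 \end{pmatrix}, \qquad
\nabla f(x, y) = 2\, J(x, y)^\top F(x, y).
```

At the minimizer $(1, 1)$ both residual components vanish, so under this reading the example is a zero-residual problem.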

Theorems & Definitions (37)

  • Lemma 1
  • Remark 1
  • Remark 2
  • Definition 1: see, e.g., Definition 1 in [Nesterov, 2013]
  • Lemma 2
  • proof
  • Lemma 3
  • Lemma 4
  • proof
  • Lemma 5
  • ...and 27 more