Inexact Gauss-Newton methods with matrix approximation by sampling for nonlinear least-squares and systems

Stefania Bellavia, Greta Malaspina, Benedetta Morini

TL;DR

This paper develops stochastic inexact Gauss-Newton methods for large-scale nonlinear least-squares problems and nonlinear systems by constructing random local models through row compression or Jacobian sampling, coupled with line-search globalization. It provides a probabilistic framework that bounds the required sample sizes to achieve a prescribed first-order accuracy with high probability, and analyzes the resulting iteration and sample complexities. The methods leverage Krylov solvers for inexact model minimization and include Levenberg–Marquardt regularization to ensure descent. Numerical experiments on binary classification and integral-equation discretizations demonstrate favorable trade-offs between computational cost and convergence compared to full Jacobian approaches, highlighting practical gains from sampling strategies such as importance and uniform sampling. The results support the practical viability of stochastic, model-based Gauss-Newton schemes for large-scale problems, with clear guidance on setup and expected performance.

Abstract

We develop and analyze stochastic inexact Gauss-Newton methods for nonlinear least-squares problems and for nonlinear systems of equations. Random models are formed using suitable sampling strategies for the matrices involved in the deterministic models. The analysis of the expected number of iterations needed in the worst case to achieve a desired level of accuracy in the first-order optimality condition provides guidelines for applying sampling and enforcing, with a fixed probability, a suitable accuracy in the random approximations. Results of the numerical validation of the algorithms are presented.
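
To make the idea of a sampled Gauss-Newton model concrete, the following is a minimal sketch and not the paper's SGN_RC or SGN_JS algorithm. It assumes uniform row sampling without replacement, a fixed Levenberg–Marquardt parameter `mu`, a direct solve in place of a Krylov solver, and Armijo backtracking on the full residual. The names `sampled_gauss_newton`, `residual`, `jacobian_rows`, `sample_size`, and `mu` are hypothetical and chosen only for illustration.

```python
import numpy as np


def sampled_gauss_newton(residual, jacobian_rows, x0, sample_size,
                         mu=1e-3, max_iter=50, tol=1e-8, seed=0):
    """Illustrative sampled Gauss-Newton iteration (assumptions in the text).

    residual(x)           -> full residual vector R(x), shape (m,)
    jacobian_rows(x, idx) -> rows idx of the Jacobian of R at x, shape (len(idx), n)
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = residual(x)
        f = 0.5 * (r @ r)
        if np.linalg.norm(r) <= tol:
            break
        m = r.size
        # Uniformly sample a subset of rows (assumption: no replacement).
        idx = rng.choice(m, size=min(sample_size, m), replace=False)
        scale = m / idx.size  # rescaling makes the sampled gradient unbiased
        J_s = jacobian_rows(x, idx)
        r_s = r[idx]
        g = scale * (J_s.T @ r_s)                       # sampled gradient
        H = scale * (J_s.T @ J_s) + mu * np.eye(x.size)  # regularized model
        # Direct solve used here for brevity; a Krylov method could replace it.
        s = np.linalg.solve(H, -g)
        slope = g @ s  # non-positive because H is positive definite
        # Armijo backtracking on the full objective 0.5 * ||R(x)||^2.
        t = 1.0
        accepted = False
        while t > 1e-10:
            r_new = residual(x + t * s)
            if 0.5 * (r_new @ r_new) <= f + 1e-4 * t * slope:
                accepted = True
                break
            t *= 0.5
        if accepted:
            x = x + t * s
        # If no step length is accepted, the next iteration draws a new sample.
    return x
```

On a consistent linear problem, for example `residual = lambda x: A @ x - b` with `jacobian_rows = lambda x, idx: A[idx]`, each accepted step nearly solves the sampled normal equations, so only a few iterations are typically needed.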

Paper Structure

This paper contains 18 sections, 10 theorems, 70 equations, 7 figures, 1 table.

Key Result

Lemma 2.1

Let $s_k$, $\widetilde{J}_k$, and $g_k$ be as in Algorithm algo. Then $s_k^T g_k \le 0$.
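
For intuition only, here is why an inequality of this kind holds in a simplified setting. The assumption, which is not part of the lemma's statement above, is that $s_k$ solves a regularized Gauss-Newton system exactly, with a regularization parameter $\mu_k \ge 0$ (a symbol introduced here for illustration):

$$(\widetilde{J}_k^T \widetilde{J}_k + \mu_k I)\, s_k = -g_k \quad\Longrightarrow\quad s_k^T g_k = -\|\widetilde{J}_k s_k\|^2 - \mu_k \|s_k\|^2 \le 0.$$

The lemma itself concerns the step actually computed by the algorithm, which may be inexact, so this exact-solve case should be read as a sketch of the mechanism rather than as the paper's proof.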

Figures (7)

  • Figure 1: Algorithm SGN_RC, $\alpha=10$, varying $m_{\max}$ and $\gamma$. Median run in terms of cost: logarithmic norm of the residual versus computational cost.
  • Figure 2: Algorithm SGN_RC, $\alpha=10$, varying $m_{\max}$ and $\gamma$. Median run in terms of cost: accuracy versus computational cost.
  • Figure 3: Algorithm SGN_JS and Integral Equation IE. Importance sampling. Median run in terms of cost: logarithmic norm of the residual versus computational cost.
  • Figure 4: Algorithm SGN_JS and Integral Equation IE. Uniform sampling, density of the sparsified Jacobian equal to $s$. Median run in terms of cost: logarithmic norm of the residual versus computational cost.
  • Figure 5: Algorithm SGN_JS with random sparsification. Median run in terms of cost. Computational cost versus logarithmic norm of the residual (a) and accuracy (b).
  • ...and 2 more figures

Theorems & Definitions (22)

  • Lemma 2.1
  • proof
  • Theorem 2.2
  • Lemma 2.3
  • proof
  • Definition 3.2
  • Lemma 3.3
  • proof
  • Lemma 3.4
  • proof
  • ...and 12 more