SVD-Preconditioned Gradient Descent Method for Solving Nonlinear Least Squares Problems

Zhipeng Chang; Wenrui Hao; Nian Liu

TL;DR

This work introduces SPGD, an SVD-based preconditioned gradient method for nonlinear least-squares problems, and integrates it with Adam-style adaptivity to form SPGD-Adam. By leveraging the local spectral information of the Jacobian, SPGD achieves a more favorable convergence factor than classical gradient descent, and the modified Adam variant provides global convergence under AMSGrad-style stabilization and regularized preconditioning. The authors supply a rigorous convergence analysis establishing local linear convergence for SPGD and global convergence for the modified Adam framework, along with detailed bounds on error terms. Empirically, SPGD and SPGD-Adam converge faster and reach lower residuals than standard Adam across function-approximation tasks, PDE-like problems, and image classification (CIFAR-10), highlighting the practical impact of problem-structure-driven preconditioning.
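As a rough illustration of what preconditioning with the Jacobian's spectral information can look like, the following NumPy sketch compares a plain gradient step with an SVD-preconditioned step on a toy exponential fit. The preconditioner used here, $V(\Sigma^2+\lambda I)^{-1}V^\top$ with $J = U\Sigma V^\top$, is an assumption chosen for illustration (it amounts to a damped Gauss-Newton step) and is not necessarily the exact SPGD update of the paper. The toy problem, the function names, the step sizes, and the damping value are likewise hypothetical.

```python
import numpy as np

# Toy nonlinear least-squares problem (hypothetical): fit y = a * exp(b * x).
def residual(theta, x, y):
    a, b = theta
    return a * np.exp(b * x) - y

def jacobian(theta, x):
    a, b = theta
    e = np.exp(b * x)
    # Columns: d(residual)/da and d(residual)/db.
    return np.stack([e, a * x * e], axis=1)

def loss(theta, x, y):
    r = residual(theta, x, y)
    return 0.5 * float(r @ r)

def gd_step(theta, x, y, alpha):
    # Plain gradient descent on 0.5 * ||F(theta)||^2; the gradient is J^T F.
    J = jacobian(theta, x)
    return theta - alpha * (J.T @ residual(theta, x, y))

def svd_preconditioned_step(theta, x, y, alpha, damping=1e-8):
    # Assumed preconditioner: P = V diag(1 / (s^2 + damping)) V^T, J = U S V^T.
    J = jacobian(theta, x)
    _, s, Vt = np.linalg.svd(J, full_matrices=False)
    grad = J.T @ residual(theta, x, y)
    direction = Vt.T @ ((Vt @ grad) / (s**2 + damping))
    return theta - alpha * direction

def run(step, theta0, x, y, alpha, n_iter):
    theta = np.array(theta0, dtype=float)
    for _ in range(n_iter):
        theta = step(theta, x, y, alpha)
    return theta

x = np.linspace(0.0, 1.0, 20)
theta_true = np.array([2.0, -1.0])
y = theta_true[0] * np.exp(theta_true[1] * x)

theta_gd = run(gd_step, [1.0, 0.0], x, y, alpha=0.02, n_iter=60)
theta_pc = run(svd_preconditioned_step, [1.0, 0.0], x, y, alpha=0.5, n_iter=60)
loss_gd = loss(theta_gd, x, y)
loss_pc = loss(theta_pc, x, y)
print(f"GD loss: {loss_gd:.3e}, preconditioned loss: {loss_pc:.3e}")
```

In this sketch the preconditioned direction rescales each singular direction of the Jacobian, so the step no longer depends on how badly the singular values are spread. That is the intuition behind the more favorable convergence factor described above, though the paper's precise factor is not reproduced here.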

Abstract

This paper introduces a novel optimization algorithm designed for nonlinear least-squares problems. The method is derived by preconditioning the gradient descent direction using the Singular Value Decomposition (SVD) of the Jacobian. This SVD-based preconditioner is then integrated with the first- and second-moment adaptive learning rate mechanism of the Adam optimizer. We establish the local linear convergence of the proposed method under standard regularity assumptions and prove global convergence for a modified version of the algorithm under suitable conditions. The effectiveness of the approach is demonstrated experimentally across a range of tasks, including function approximation, partial differential equation (PDE) solving, and image classification on the CIFAR-10 dataset. Results show that the proposed method consistently outperforms standard Adam, achieving faster convergence and lower error in both regression and classification settings.
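To make the combination with Adam concrete, here is a minimal NumPy sketch that feeds an SVD-preconditioned gradient into Adam-style first- and second-moment estimates, with a running maximum of the second moment in the spirit of AMSGrad. It is a sketch under stated assumptions, not the paper's algorithm: the regularized preconditioner $V(\Sigma^2+\lambda I)^{-1}V^\top$, the use of bias correction, the toy exponential-fit problem, and all names and hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def residual_and_jacobian(theta, x, y):
    # Toy model (hypothetical): residual F(theta) = a * exp(b * x) - y.
    a, b = theta
    e = np.exp(b * x)
    return a * e - y, np.stack([e, a * x * e], axis=1)

def loss(theta, x, y):
    r, _ = residual_and_jacobian(theta, x, y)
    return 0.5 * float(r @ r)

def preconditioned_gradient(theta, x, y, damping=1e-6):
    # Assumed regularized preconditioner: V diag(1 / (s^2 + damping)) V^T.
    r, J = residual_and_jacobian(theta, x, y)
    _, s, Vt = np.linalg.svd(J, full_matrices=False)
    grad = J.T @ r
    return Vt.T @ ((Vt @ grad) / (s**2 + damping))

def preconditioned_adam(theta0, x, y, lr=0.05, beta1=0.9, beta2=0.999,
                        eps=1e-8, n_iter=500):
    theta = np.array(theta0, dtype=float)
    m = np.zeros_like(theta)      # first-moment estimate
    v = np.zeros_like(theta)      # second-moment estimate
    v_max = np.zeros_like(theta)  # running maximum (AMSGrad-style)
    for t in range(1, n_iter + 1):
        g = preconditioned_gradient(theta, x, y)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g**2
        m_hat = m / (1.0 - beta1**t)  # bias correction (a design assumption)
        v_hat = v / (1.0 - beta2**t)
        v_max = np.maximum(v_max, v_hat)  # denominator never decreases
        theta = theta - lr * m_hat / (np.sqrt(v_max) + eps)
    return theta

x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.0 * x)
theta_init = [1.0, 0.0]
loss_init = loss(np.array(theta_init), x, y)
theta_final = preconditioned_adam(theta_init, x, y)
loss_final = loss(theta_final, x, y)
print(f"initial loss: {loss_init:.3e}, final loss: {loss_final:.3e}")
```

Keeping the running maximum `v_max` makes the effective per-coordinate step size non-increasing, which is the usual motivation for AMSGrad-style stabilization in convergence proofs.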

Paper Structure (16 sections, 3 theorems, 72 equations, 1 figure, 1 algorithm)

Introduction
SVD-Preconditioned Gradient Descent (SPGD) method
Method derivation
Local convergence
Adam framework with SPGD method
Modified Adam algorithm
Global convergence
Assumptions
Convergence theorem
Bounding $I_{1,t}$
Bounding $I_{2,t}$
Decomposing $I_{3,t}$
Bounding $I_{4,t}$
Summarization
Numerical Experiments
...and 1 more sections

Key Result

Theorem 1

Suppose that $F$ satisfies the conditions of the regularity assumption. Consider the GD iteration, where $\alpha>0$ denotes the step size. Then, for any sufficiently small $\alpha$, there exists a neighborhood $U_{\alpha}\subset U$ of $\theta^*$ such that the gradient descent sequence $\{\theta_t\}_{t\geq 0}$ generated by this iteration with initial point $\theta_0\in U_{\alpha}$ satisfies ...

Figures (1)

Figure 1: Scenario I (Varying Frequency): Test loss versus the number of epochs with fixed dimension $d$ and varying frequency parameters $n \in \{5, 7\}$. The shaded region in each subplot indicates the interquartile range over 10 independent runs with different random seeds.

Theorems & Definitions (11)

Theorem 1: Local linear convergence of GD near a regular equilibrium
proof
Remark 1
Theorem 2: Local linear convergence of the SPGD method near a regular equilibrium
proof
Remark 2
Remark 3: Extension to cross-entropy loss
Remark 4: Efficient computation for large-scale networks
Remark 5
Theorem 3
...and 1 more
