Convergence Analysis of Block Newton Methods for 1D Shallow Neural Network Approximation
Zhiqiang Cai, Anastassia Doktorova, Robert D. Falgout, César Herrera
TL;DR
This work analyzes the local convergence of block Newton (BN) methods for one-dimensional shallow ReLU networks with $n$ neurons, where the parameter vector $θ ∈ ℝ^{2n+1}$ is split into linear parameters $c ∈ ℝ^{n+1}$ and nonlinear parameters $b ∈ ℝ^n$. The BN framework combines a 2×2 block outer iteration (nonlinear Gauss-Seidel (NL-GS), linear Gauss-Seidel (L-GS), or Jacobi) with inner Newton solves. Convergence is analyzed through a fixed-point map $G(θ)$ and its Jacobian $J_G(θ^*)$, establishing local convergence when the Hessian $∇^2_θ F(θ^*)$ is symmetric positive definite (SPD) and the block inverses exist. The reduced BN (rBN) method additionally drops non-contributing neurons to shrink the parameter set while preserving convergence under the same SPD-type conditions. Applications to 1D diffusion-reaction problems and least-squares approximation include a numerical example showing interior-layer recovery and substantial error reduction.
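To illustrate why an SPD Hessian matters, the following is a minimal sketch on a quadratic model of $F$, not the paper's proof. On such a model a 2×2 block Gauss-Seidel sweep with exact block solves is an affine map, and its error-propagation matrix has spectral radius below one whenever the Hessian is SPD. The matrix `H` is a random SPD stand-in, and `block_gs_iteration_matrix` and `n = 4` are hypothetical choices made only for this example.

```python
import numpy as np

def block_gs_iteration_matrix(H, m):
    """Error-propagation matrix of 2x2 block Gauss-Seidel applied to H.

    The first block holds the leading m unknowns, the second block the rest.
    """
    lower = H.copy()
    lower[:m, m:] = 0.0      # block lower-triangular part, diagonal blocks included
    upper = H - lower        # strictly block upper-triangular part
    return -np.linalg.solve(lower, upper)

n = 4                        # hypothetical number of neurons
dim = 2 * n + 1              # size of theta = (c, b)
rng = np.random.default_rng(0)
A = rng.standard_normal((dim, dim))
H = A @ A.T + dim * np.eye(dim)   # random SPD stand-in for the Hessian at theta*

M = block_gs_iteration_matrix(H, n + 1)   # linear block c has n + 1 entries
rho = max(abs(np.linalg.eigvals(M)))      # spectral radius of the sweep
print(f"spectral radius of block Gauss-Seidel map: {rho:.3f}")
```

A spectral radius below one means the error contracts asymptotically from sweep to sweep, which is the mechanism behind a local convergence statement of this type.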
Abstract
This paper analyzes local convergence of the block Newton (BN) method introduced in [5, 6] for one-dimensional shallow neural network approximation to functions and diffusion-reaction problems. The BN method uses the 2×2 block nonlinear Gauss-Seidel, linear Gauss-Seidel, or Jacobi method for the outer iteration and the Newton method for the inner iteration. The two blocks correspond to the linear and the nonlinear parameters. Under some reasonable assumptions, we establish local convergence of the BN methods as well as the reduced BN (rBN) method for one-dimensional diffusion-reaction problems and least-squares function approximation. Unlike common optimization methods, the rBN allows for the reduction of the number of parameters during the optimization process when some neurons contribute little to the approximation or are at nearly optimal locations.
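The split into linear and nonlinear parameters can be made concrete with a small sketch. It assumes the parameterization $u(x) = c_0 + \sum_{i=1}^{n} c_i\,\mathrm{ReLU}(x - b_i)$, which matches the stated dimensions but may differ from the paper's exact form, and it uses a discrete least-squares fit on a uniform grid. For fixed breakpoints $b$ the functional is quadratic in $c$, so the linear block reduces to a single linear solve. The target function, grid, and helper names are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def design_matrix(x, b):
    """Columns: constant term, then one ReLU neuron per breakpoint in b."""
    return np.column_stack([np.ones_like(x)] + [relu(x - bi) for bi in b])

def solve_linear_block(x, f, b):
    """Best linear parameters c for fixed breakpoints b (discrete least squares)."""
    c, *_ = np.linalg.lstsq(design_matrix(x, b), f, rcond=None)
    return c

x = np.linspace(0.0, 1.0, 201)
f = np.abs(x - 0.5)                 # hypothetical target with a kink at 0.5
b = np.array([0.0, 0.5])            # n = 2 breakpoints, so c has n + 1 = 3 entries
c = solve_linear_block(x, f, b)
residual = np.max(np.abs(design_matrix(x, b) @ c - f))
print("c =", c, " max residual =", residual)
```

With a breakpoint placed exactly at the kink, the target is representable and the residual is at round-off level. Moving that breakpoint away from 0.5 leaves an error that no choice of $c$ can remove, which is why the nonlinear block $b$ needs its own update.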
