A neuron-wise subspace correction method for the finite neuron method
Jongho Park, Jinchao Xu, Xiaofeng Xu
TL;DR
This work introduces Neuron-wise Parallel Subspace Correction (NPSC) for the finite neuron method, addressing the slow convergence of gradient-based training caused by ill-conditioning in the linear layer. NPSC decomposes the parameter space into a linear part and per-neuron nonlinear parts, then alternates between a linear-layer solve and parallel local neuron optimizations. For one-dimensional problems, a newly designed optimal preconditioner makes the number of iterations of the linear solve independent of the neuron count. The nonlinear neuron updates use a Levenberg–Marquardt strategy to find good local minima, with an adjustment step and backtracking to stabilize and accelerate convergence. In function-approximation and elliptic PDE experiments, NPSC outperforms standard gradient-based methods, including in oscillatory and higher-dimensional settings where quadrature costs are nontrivial. Ablations confirm that preconditioning, adaptive learning rates, and neuron-wise optimization are each critical to the gains in accuracy and robustness.
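The alternation described above can be illustrated with a minimal NumPy sketch for 1D function approximation with a shallow ReLU network. It is not the authors' implementation: the linear layer is solved by a direct least-squares call rather than a preconditioned iterative solver, each neuron receives a single damped Gauss-Newton (Levenberg–Marquardt-type) step, and a simple backtracking loop on a shared step size stands in for the adjustment step. All names (`npsc_like_sketch`, `neuron_correction`, `damping`) are hypothetical.

```python
import numpy as np

def features(x, w, b):
    # Column i holds neuron i at the sample points: relu(w_i * x + b_i).
    return np.maximum(np.outer(x, w) + b, 0.0)

def loss(x, y, a, w, b):
    r = features(x, w, b) @ a - y
    return 0.5 * np.mean(r ** 2)

def solve_linear_layer(x, y, w, b):
    # Linear subspace: least-squares fit of the outer coefficients a,
    # with all neurons (w, b) frozen. Assumption: a direct solve replaces
    # the preconditioned iterative solver described in the text.
    a, *_ = np.linalg.lstsq(features(x, w, b), y, rcond=None)
    return a

def neuron_correction(x, r, a_i, w_i, b_i, damping=1e-3):
    # Local subspace of one neuron: a single damped Gauss-Newton step on
    # (w_i, b_i) for the residual r, with every other parameter frozen.
    active = (w_i * x + b_i > 0).astype(float)
    J = np.column_stack([a_i * active * x, a_i * active])
    H = J.T @ J + damping * np.eye(2)
    return -np.linalg.solve(H, J.T @ r)

def npsc_like_sketch(x, y, n=8, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    w, b = rng.normal(size=n), rng.normal(size=n)
    a = solve_linear_layer(x, y, w, b)
    history = [loss(x, y, a, w, b)]
    for _ in range(iters):
        r = features(x, w, b) @ a - y
        # "Parallel": every correction is computed from the same iterate.
        d = np.array([neuron_correction(x, r, a[i], w[i], b[i])
                      for i in range(n)])
        # Backtracking on a shared step size; accept only if the loss drops.
        t, base = 1.0, history[-1]
        while t > 1e-8:
            w_new, b_new = w + t * d[:, 0], b + t * d[:, 1]
            a_new = solve_linear_layer(x, y, w_new, b_new)
            if loss(x, y, a_new, w_new, b_new) < base:
                a, w, b = a_new, w_new, b_new
                break
            t *= 0.5
        history.append(loss(x, y, a, w, b))
    return a, w, b, history

x = np.linspace(0.0, 1.0, 200)
y = np.sin(2.0 * np.pi * x)
a, w, b, history = npsc_like_sketch(x, y)
```

Because a step is accepted only when it lowers the loss, `history` is non-increasing by construction. The list comprehension over neurons is the part that could run in parallel, since each local problem touches only its own pair (w_i, b_i).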
Abstract
In this paper, we propose a novel algorithm called the Neuron-wise Parallel Subspace Correction Method (NPSC) for the finite neuron method, which approximates numerical solutions of partial differential equations (PDEs) using neural network functions. Despite extensive research on applying neural networks to numerical PDEs, there is still a serious lack of effective training algorithms that can achieve adequate accuracy, even for one-dimensional problems. Based on recent results on the spectral properties of linear layers and landscape analysis for single neuron problems, we develop a special type of subspace correction method that optimizes the linear layer and each neuron in the nonlinear layer separately. An optimal preconditioner that resolves the ill-conditioning of the linear layer is presented for one-dimensional problems, so that the linear layer is trained in a number of iterations that is uniform with respect to the number of neurons. In each single neuron problem, a good local minimum that avoids flat energy regions is found by a superlinearly convergent algorithm. Numerical experiments on function approximation problems and PDEs demonstrate that the proposed method outperforms other gradient-based methods.
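To give a feel for the ill-conditioning that the preconditioner targets, the following sketch measures the condition number of a discrete Gram matrix of ReLU neurons on [0, 1] as the neuron count grows. The uniform breakpoints, the sample-based quadrature, and the name `relu_gram_condition` are assumptions made for illustration; the sketch does not construct the paper's preconditioner.

```python
import numpy as np

def relu_gram_condition(n, samples=4000):
    # Hypothetical setup: neurons relu(x - t_i) on [0, 1] with uniform
    # breakpoints t_i = i / n, for i = 0, ..., n - 1.
    x = np.linspace(0.0, 1.0, samples)
    t = np.arange(n) / n
    Phi = np.maximum(x[:, None] - t[None, :], 0.0)
    # Sample average as a crude approximation of the L2 inner products.
    M = Phi.T @ Phi / samples
    return np.linalg.cond(M)

# Condition number of the linear-layer Gram matrix for several widths.
conds = {n: relu_gram_condition(n) for n in (4, 8, 16, 32)}
```

In this setup the condition number increases with the number of neurons, which is the kind of behavior that slows an unpreconditioned iterative solver for the linear layer on wider networks.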
