Spectrum-Informed Multistage Neural Networks: Multiscale Function Approximators of Machine Precision

Jakin Ng; Yongji Wang; Ching-Yao Lai

TL;DR

This work tackles the challenge of achieving machine-precision regression in scientific machine learning by introducing Spectrum-Informed Multistage Neural Networks (SI-MSNN). The method initializes the first-layer Fourier feature embedding with the target's dominant Fourier modes and, across multiple stages, trains each stage on the residue left by the previous one. This mitigates spectral bias and aligns learning with neural tangent kernel (NTK) directions, enabling rapid convergence. In 1D and 2D turbulence settings, SI-MSNN reaches $O(10^{-16})$ precision and closely matches both the target function and its power spectrum, significantly outperforming scale-factor baselines. The approach offers a promising path toward precision physics-informed ML and high-fidelity multiscale PDE solvers.
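The idea of seeding a first layer with the target's dominant Fourier modes can be illustrated with a minimal NumPy sketch. It assumes a 1D target sampled on a uniform periodic grid and a cosine feature of the form $\cos(w x + b)$; the function name `dominant_fourier_modes` and the toy signal are hypothetical and are not taken from the paper.

```python
import numpy as np

def dominant_fourier_modes(u, length, k):
    """Return (weights, biases, amplitudes) for the k strongest nonzero modes.

    Hypothetical helper: a mode with DFT coefficient c at frequency f
    contributes (2|c|/n) * cos(2*pi*f*x + arg(c)) to the signal, so
    2*pi*f can seed a first-layer weight and arg(c) the matching bias.
    """
    n = u.size
    coeffs = np.fft.rfft(u)
    freqs = np.fft.rfftfreq(n, d=length / n)  # cycles per unit length
    mags = np.abs(coeffs)
    mags[0] = 0.0                             # skip the mean (zero frequency)
    top = np.argsort(mags)[::-1][:k]          # indices of the k largest modes
    weights = 2.0 * np.pi * freqs[top]
    biases = np.angle(coeffs[top])
    amplitudes = 2.0 * np.abs(coeffs[top]) / n
    return weights, biases, amplitudes

# Toy two-scale target on a periodic unit interval (illustrative only).
x = np.linspace(0.0, 1.0, 256, endpoint=False)
u = np.sin(2 * np.pi * 3 * x) + 0.25 * np.cos(2 * np.pi * 17 * x)

weights, biases, amplitudes = dominant_fourier_modes(u, length=1.0, k=2)

# Cosine features built from the selected modes already reproduce the target.
recon = (amplitudes * np.cos(np.outer(x, weights) + biases)).sum(axis=1)
max_error = np.max(np.abs(recon - u))
```

In this toy case the two selected modes are the whole signal, so the reconstruction error sits at round-off level. For a turbulent target the selected modes would only be a starting point for training.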

Abstract

Deep learning frameworks have become powerful tools for approaching scientific problems such as turbulent flow, which has wide-ranging applications. In practice, however, existing scientific machine learning approaches have difficulty fitting complex, multi-scale dynamical systems to the very high precision required in scientific contexts. We propose combining the novel multistage neural network approach, in which each stage learns the residue from the previous stage, with a spectrum-informed initialization. The initialization lets each stage capture the high-frequency features in the residue and thereby overcomes the spectral bias of neural networks. This approach allows the neural network to fit target functions to double-precision floating-point machine precision, $O(10^{-16})$.
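The stage-by-stage residue fitting described above can be sketched in a few lines. This is a simplified stand-in rather than the authors' method: a linear least-squares fit on cosine and sine features replaces the trained network at each stage, the features are chosen from the dominant Fourier modes of the current residue, and rescaling the residue before each fit is an assumption of this sketch. The names `top_modes` and `fit_stage` and the toy target are hypothetical.

```python
import numpy as np

def top_modes(r, k):
    """Integer wavenumbers of the k largest DFT modes of r (unit-length grid)."""
    mags = np.abs(np.fft.rfft(r))
    return np.argsort(mags)[::-1][:k]

def fit_stage(x, r, k):
    """Least-squares fit of r on cos/sin features at its k dominant modes.

    Stand-in for training one stage; a real stage would train a network
    whose first layer is initialized from these modes.
    """
    phase = 2.0 * np.pi * np.outer(x, top_modes(r, k))
    features = np.hstack([np.cos(phase), np.sin(phase)])
    coef, *_ = np.linalg.lstsq(features, r, rcond=None)
    return features @ coef

# Toy multiscale target: amplitudes spanning seven orders of magnitude.
x = np.linspace(0.0, 1.0, 512, endpoint=False)
wavenumbers = [1, 4, 9, 23, 57, 120]
amplitudes = [1.0, 0.3, 1e-2, 1e-3, 1e-5, 1e-7]
u = sum(a * np.sin(2 * np.pi * m * x + 0.1 * m)
        for a, m in zip(amplitudes, wavenumbers))

residue = u.copy()
history = [np.max(np.abs(residue))]      # max residue before any stage
for stage in range(3):
    scale = np.sqrt(np.mean(residue**2))  # assumed: normalize residue to O(1)
    approx = scale * fit_stage(x, residue / scale, k=2)
    residue = residue - approx            # becomes the next stage's target
    history.append(np.max(np.abs(residue)))
```

Each stage here removes the two strongest remaining modes, so `history` drops by several orders of magnitude per stage until it reaches round-off level. The rescaling keeps every stage's target at unit magnitude, which is the reason small residues remain fittable in this sketch.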

Paper Structure (19 sections, 11 equations, 3 figures)

Introduction
Precision machine learning
Spectrum-informed initialization for regression
Preliminaries
Discrete Fourier transform
Fourier feature networks
Random Fourier features
Fourier feature mapping
Cosinusoidal activation
Neural tangent kernel
Methods
Problem setup
Multi-stage neural networks (MSNN)
$L^p$ Norm
Spectrum-informed initialization
...and 4 more sections

Figures (3)

Figure 1: (a) Target function $u_g(x, y)$ for a 2-D regression problem. (b) Errors of neural networks with respect to the target function $u_g$ after different stages of training using the original multi-stage neural networks (MSNNs), where a scaling factor was used to mitigate the spectral bias of the network training. The 3rd-stage residue reaches $O(10^{-8})$. (c) Errors after different stages of training using spectrum-informed initialization for both the weights and biases of the network, which reach $O(10^{-13})$ by the 3rd stage of training.
Figure 2: (a) Target function $\psi(x, y)$, a single time snapshot of the numerical solution of the stream function for the 2D incompressible Navier-Stokes equations \ref{eq:nse1} and \ref{eq:nse2} with Re = 2000. (b--e) The residues after each of the four stages of training, which are the target functions for the next stages. After four stages, the residue has approached machine precision $O(10^{-16})$. (g) Result of the Spectrum-Informed Multistage Neural Network (SI-MSNN) with four stages of training. (h--j) The spectral domain of each of the residues.
Figure 3: Left: A comparison of the power spectrum of the 2D Navier-Stokes example in Fig. \ref{fig:multistage-results} given by a single-stage SI-MSNN and a four-stage SI-MSNN, each with three hidden layers of width 30. The single-stage SI-MSNN has first layer width $n_f = 10000$. The first stage of the four-stage SI-MSNN has first layer width $n_f = 1359$, based on the number of primary Fourier modes present, and the remaining stages all have first layer width $n_f = 10000$. Right: The loss convergence of a single-stage SI-MSNN compared to a four-stage SI-MSNN, where training terminates when machine precision is reached.
