Spectral bias in physics-informed and operator learning: Analysis and mitigation guidelines

Siavash Khodakarami, Vivek Oommen, Nazanin Ahmadi Daryakenari, Maxim Beekenkamp, George Em Karniadakis

TL;DR

This work provides a systematic investigation of spectral bias in physics-informed and operator learning frameworks, with emphasis on the coupled roles of network architecture, activation functions, loss design, and optimization strategy, and demonstrates that spectral bias is not simply representational but fundamentally dynamical.

Abstract

Neural networks and Kolmogorov-Arnold Networks (KANs) used to solve partial differential equations (PDEs), including physics-informed neural networks (PINNs), physics-informed KANs (PIKANs), and neural operators, are known to exhibit spectral bias, whereby low-frequency components of the solution are learned significantly faster than high-frequency modes. While spectral bias is often treated as an intrinsic representational limitation of neural architectures, its interaction with optimization dynamics and physics-based loss formulations remains poorly understood. In this work, we provide a systematic investigation of spectral bias in physics-informed and operator learning frameworks, with emphasis on the coupled roles of network architecture, activation functions, loss design, and optimization strategy. We quantify spectral bias through frequency-resolved error metrics, Barron-norm diagnostics, and higher-order statistical moments, enabling a unified analysis across elliptic, hyperbolic, and dispersive PDEs. Through diverse benchmark problems, including the Korteweg-de Vries, wave, and steady-state diffusion-reaction equations, turbulent flow reconstruction, and earthquake dynamics, we demonstrate that spectral bias is not simply representational but fundamentally dynamical. In particular, second-order optimization methods substantially alter the spectral learning order, enabling earlier and more accurate recovery of high-frequency modes for all PDE types. For neural operators, we further show that spectral bias depends on the neural operator architecture and can also be effectively mitigated through spectral-aware loss formulations without increasing the inference cost.
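
The abstract refers to frequency-resolved error metrics without defining them in this summary. As a minimal sketch of one plausible form (an assumption, not the authors' definition), the snippet below computes the relative L2 error of the Fourier coefficients within contiguous frequency bands of a 1D periodic signal. The function name `band_errors`, the parameter `n_bands`, and the synthetic signals are hypothetical and chosen only for illustration.

```python
import numpy as np


def band_errors(u_true, u_pred, n_bands=3):
    """Relative L2 error of the Fourier coefficients in each frequency band.

    Illustrative metric only; the paper's exact definition may differ.
    """
    # rfft keeps only the non-negative frequencies of a real-valued signal
    f_true = np.fft.rfft(u_true)
    f_pred = np.fft.rfft(u_pred)
    # Split the mode indices into contiguous bands, ordered low to high
    bands = np.array_split(np.arange(f_true.size), n_bands)
    errors = []
    for idx in bands:
        num = np.linalg.norm(f_pred[idx] - f_true[idx])
        den = np.linalg.norm(f_true[idx])
        # Guard against a band with no energy in the reference signal
        errors.append(num / den if den > 0 else np.nan)
    return np.array(errors)


# Hypothetical target with one component in each of three bands
x = np.linspace(0.0, 1.0, 256, endpoint=False)
u_true = (np.sin(2 * np.pi * 2 * x)
          + 0.5 * np.sin(2 * np.pi * 60 * x)
          + 0.25 * np.sin(2 * np.pi * 100 * x))
# Hypothetical "prediction" mimicking spectral bias: the low mode is exact,
# the middle mode has half its amplitude, and the high mode is missing
u_pred = np.sin(2 * np.pi * 2 * x) + 0.25 * np.sin(2 * np.pi * 60 * x)

errs = band_errors(u_true, u_pred, n_bands=3)
print(errs)
```

In this synthetic setup the error rises from near zero in the lowest band to about one in the highest, which is the signature of spectral bias described above. Tracking such per-band errors across training epochs would show the order in which modes are learned.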

Paper Structure

This paper contains 28 sections, 62 equations, 17 figures, and 10 tables.

Figures (17)

  • Figure 1: Case 1: Training-time spectral evolution for the discontinuous benchmark. Comparison of MLP–Tanh, MLP–SIREN, cKAN, and Tanh-cKAN on the piecewise-discontinuous target function. Columns correspond to different training epochs, illustrating the evolution of the predicted signal in physical space (top row of each block) and the corresponding Fourier amplitude spectrum (bottom row of each block). The ground-truth solution is shown in black, while network predictions are shown in red. The discontinuity induces broad-band high-frequency content in the frequency domain, highlighting differences in how each architecture and optimization strategy captures sharp transitions and recovers high-frequency modes over training.
  • Figure 2: Case 2: Training-time spectral evolution for the smooth multi-frequency benchmark. Comparison of MLP–Tanh, MLP–SIREN, cKAN, and Tanh-cKAN on the multi-scale multi-frequency target function. Columns correspond to different training epochs, illustrating the evolution of the predicted signal in physical space (top row of each block) and the corresponding Fourier amplitude spectrum (bottom row of each block). The ground-truth solution is shown in black, while network predictions are shown in red. Although the target function is smooth, its multiple well-separated frequency components make this benchmark sensitive to spectral bias, revealing differences in how each architecture captures and balances low- and high-frequency modes during training.
  • Figure 3: KdV equation: PINN training history with different optimizers (Adam, L-BFGS, SOAP, and SS-Broyden) and activation functions (Tanh and SIREN). The sub-figure shows the zoomed-in training history for the SS-Broyden optimizer.
  • Figure 4: KdV equation: Analysis of the predictions. (a, b) First four moments (mean, variance, skewness, and kurtosis) of the PINN predictions with (a) the Adam optimizer and Tanh activation, and (b) the SOAP or SS-Broyden optimizer with Tanh activation. (c) Time-averaged absolute error at each moment for PINN predictions with different optimizers and activations. (d) Barron norm at each time-step of PINN predictions with different optimizers.
  • Figure 5: KdV equation: Effect of optimizer and activation function. First row: Ground truth (GT) solution, gradient of the GT solution, and Laplacian of the GT solution. In the remaining rows, the first, second, and third columns show the errors in the PINN solution, the errors in the gradient magnitude of the solution, and the errors in the Laplacian of the solution, respectively. These rows are ordered from the most to the least accurate results (top to bottom). The axis ranges ($x \in [-1,1]$, $t \in [0,1]$) shown in the top-left subplot apply to all subplots.
  • ...and 12 more figures