LOGLO-FNO: Efficient Learning of Local and Global Features in Fourier Neural Operators

Marimuthu Kalimuthu, David Holzmüller, Mathias Niepert

TL;DR

This work proposes two architectural enhancements that improve the ability of Fourier Neural Operators to represent a broad range of frequency components: a parallel branch performing local spectral convolution and a high-frequency propagation module. It also introduces a novel frequency-sensitive loss based on radially binned spectral errors.

Abstract

Modeling high-frequency information is a critical challenge in scientific machine learning. For instance, fully turbulent flow simulations of the Navier-Stokes equations at Reynolds numbers 3500 and above can generate high-frequency signals due to swirling fluid motions caused by eddies and vortices. Faithfully modeling such signals using neural nets depends on the accurate reconstruction of moderate to high frequencies. However, it is well known that neural nets exhibit a spectral (frequency) bias towards learning low-frequency components. Meanwhile, Fourier Neural Operators (FNOs) have emerged as a popular class of data-driven models for surrogate modeling and solving PDEs. Although they have achieved impressive results on several PDE benchmark problems, FNOs perform poorly in learning non-dominant frequencies characterized by local features. This limitation stems from the spectral bias inherent in neural nets and from the explicit exclusion of high-frequency modes in FNOs and their variants. Therefore, to mitigate these issues and improve FNO's spectral learning capabilities to represent a broad range of frequency components, we propose two key architectural enhancements: (i) a parallel branch performing local spectral convolution, and (ii) a high-frequency propagation module. Moreover, we propose a novel frequency-sensitive loss based on radially binned spectral errors. The introduction of a parallel branch for local convolution reduces the trainable parameters by up to 50% while achieving the accuracy of an FNO that relies solely on global convolution. Our findings also demonstrate that the proposed model improves stability over longer rollouts. Experiments on six challenging PDEs in fluid mechanics, wave propagation, and biological pattern formation, together with qualitative and spectral analysis of the predictions, show the effectiveness of our method over state-of-the-art (SOTA) neural operator baselines.
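To make the idea of a radially binned spectral error concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `radially_binned_spectral_error`, the normalization by the grid size, the rounding of radii to the nearest integer wavenumber, and the final square root are all choices made here for illustration, and the exact definition in the paper may differ.

```python
import numpy as np


def radially_binned_spectral_error(pred, target):
    """Root of the summed spectral error energy per radial wavenumber bin.

    pred, target: real arrays of shape (H, W) on a periodic grid.
    Returns a 1D array with one entry per integer radius 0..min(H, W)//2.
    All normalization and binning choices here are assumptions.
    """
    H, W = pred.shape
    # 2D Fourier transform of the pointwise error, normalized by grid size.
    err_hat = np.fft.fft2(pred - target) / (H * W)
    energy = np.abs(err_hat) ** 2

    # Integer wavenumbers along each axis and their radial distance.
    ky = np.fft.fftfreq(H) * H
    kx = np.fft.fftfreq(W) * W
    radius = np.sqrt(ky[:, None] ** 2 + kx[None, :] ** 2)

    # Assign every mode to the nearest integer radius; drop corner modes
    # whose radius exceeds the largest fully resolved wavenumber.
    bins = np.rint(radius).astype(int)
    n_bins = min(H, W) // 2 + 1
    keep = bins < n_bins
    summed = np.bincount(bins[keep], weights=energy[keep], minlength=n_bins)
    return np.sqrt(summed)


# Toy check: an error that is a pure cosine with wavenumber 3 along x
# should put all of its energy into radial bin 3.
H = W = 64
x = np.arange(W) / W
pred = np.tile(np.cos(2 * np.pi * 3 * x), (H, 1))
target = np.zeros((H, W))
binned = radially_binned_spectral_error(pred, target)
print(int(np.argmax(binned)))  # 3
```

Grouping the returned bins into low, mid, and high bands would then only require choosing two cutoff radii; this section does not state the cutoffs used in the paper, so none are assumed here.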

Paper Structure

This paper contains 47 sections, 20 equations, 15 figures, 1 algorithm.

Figures (15)

  • Figure 1: The overall architecture of our proposed LOGLO-FNO model. $\mathbf{X}$ is the discretization of $a(x)$, $\hat{\mathbf{X}}$ the patched version, and $\mathbf{X}_H$ the extracted high frequencies (cf. Eqn. \ref{eq:hfp-module-nd}). The full network has $\mathcal{L}$ repetitions of identical LOGLO Fourier layers, and the final activation function is applied on all but the last LOGLO Fourier layer. The features from the three branches and the skip connections are fused by a simple summation.
  • Figure 2: Representative illustration of radially binned spectral error. Left: the spectral energy of error magnitude with overlaid radial bins (blue -- low frequency, green -- mid frequency cutoff, the rest -- high frequency band); Right: the energy of the radially binned spectral error across radial distances as a line plot.
  • Figure 3: 1-step and 5-step autoregressive (AR) evaluation of LOGLO-FNO compared with SOTA baselines on the test set of 2D Kolmogorov Flow [markov-neural-operator-li:2022]. Rel. % Diff indicates improvement (-) or degradation (+) with respect to FNO. LOGLO-FNO uses 40 and (16, 9) modes in the global and local branches, respectively, and the width is set to 65. $\text{NO-LIDK}^\ast$ denotes using only the localized integral kernel, $\text{NO-LIDK}^\diamond$ only the differential kernel, and $\text{NO-LIDK}^\dagger$ both. $\text{Transolver}^\star$ indicates that the model was trained longer, for 500 epochs, due to convergence issues with the shorter schedule of 136 epochs.
  • Figure 4: Comparison of FNO vs. LOGLO-FNO showing 1-step fRMSE ($\downarrow$) on the test set of the Kolmogorov Flow dataset ($Re=5k$) [markov-neural-operator-li:2022] for a varying number of modes in the global branch and the full set of modes (i.e., (16, 9) for patch size $16 \times 16$) in the local branch.
  • Figure 5: Comparison of radially binned spectral errors of baselines and LOGLO-FNO on the test set of Kolmogorov Flow ($Re=5k$) [markov-neural-operator-li:2022].
  • ...and 10 more figures
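
The "(16, 9)" mode count quoted in the Figure 3 and Figure 4 captions for $16 \times 16$ patches matches the output shape of a real-input 2D FFT, which keeps only the non-redundant half of the last axis ($16/2 + 1 = 9$). The following NumPy sketch illustrates this. It assumes non-overlapping square patches and uses a hypothetical helper `to_patches`; the patching scheme actually used in the local branch may differ.

```python
import numpy as np


def to_patches(x, patch):
    """Split an (H, W) field into non-overlapping (patch, patch) tiles.

    Returns an array of shape (num_patches, patch, patch), ordered row by row.
    Hypothetical helper for illustration only.
    """
    H, W = x.shape
    if H % patch or W % patch:
        raise ValueError("field size must be divisible by the patch size")
    tiles = x.reshape(H // patch, patch, W // patch, patch)
    return tiles.transpose(0, 2, 1, 3).reshape(-1, patch, patch)


rng = np.random.default_rng(0)
field = rng.standard_normal((64, 64))  # stand-in for one discretized channel

patches = to_patches(field, 16)
# rfft2 transforms the last two axes, i.e. each patch independently.
patch_spectrum = np.fft.rfft2(patches)

print(patches.shape, patch_spectrum.shape)  # (16, 16, 16) (16, 16, 9)
```

Each patch therefore has a full spectrum of 16 by 9 complex coefficients, which is the sense in which the local branch can retain "the full set of modes" for this patch size.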