Enhancing Solutions for Complex PDEs: Introducing Complementary Convolution and Equivariant Attention in Fourier Neural Operators

Xuanle Zhao, Yue Sun, Tielin Zhang, Bo Xu

TL;DR

This work addresses the challenge of solving complex, multiscale PDEs with neural operators, noting that the Fourier Neural Operator (FNO) tends to learn kernels primarily in the low-frequency domain. It introduces a hierarchical attentive Fourier neural operator that combines convolutional-residual Fourier layers with translation-equivariant attention to jointly capture low- and high-frequency features across scales. The authors prove that the attentive convolution is translation-equivariant and that Fourier transforms commute with orthogonal group actions, enabling symmetry-respecting operation in the frequency domain. Empirically, the method achieves superior performance on forward and inverse problems involving multiscale elliptic equations and Navier–Stokes dynamics, particularly under rapid coefficient variations and noisy data, demonstrating improved robustness and accuracy over prior PDE solvers.
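
The translation-equivariance property mentioned above can be checked numerically for an ordinary circular convolution. The sketch below is only an illustration under simplifying assumptions: it uses a plain 1D periodic convolution with a random kernel, not the paper's attentive convolution, and the names `circular_conv`, `u`, `k`, and `shift` are hypothetical.

```python
import numpy as np

def circular_conv(u, k):
    """Circular convolution of two equal-length real signals via the FFT."""
    return np.fft.ifft(np.fft.fft(u) * np.fft.fft(k)).real

rng = np.random.default_rng(0)
u = rng.standard_normal(64)   # input signal on a periodic grid (illustrative)
k = rng.standard_normal(64)   # convolution kernel (illustrative)
shift = 5                     # translation in grid points

# Equivariance: translating the input and then convolving gives the same
# result as convolving first and then translating the output.
shift_then_conv = circular_conv(np.roll(u, shift), k)
conv_then_shift = np.roll(circular_conv(u, k), shift)
equivariant = np.allclose(shift_then_conv, conv_then_shift)
```

Here `equivariant` evaluates to `True` because a translation only multiplies each Fourier coefficient by a phase factor, which commutes with the pointwise multiplication by the kernel's spectrum.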

Abstract

Neural operators extend conventional neural networks by learning mappings between function spaces, which makes them suitable for solving partial differential equations (PDEs). One of the most notable methods is the Fourier Neural Operator (FNO), which draws inspiration from the Green's function method and directly approximates operator kernels in the frequency domain. However, through empirical observation followed by theoretical validation, we demonstrate that the FNO approximates kernels primarily in a relatively low-frequency domain. This suggests a limited capability in solving complex PDEs, particularly those characterized by rapid coefficient changes and oscillations in the solution space. Such cases are crucial in scenarios such as atmospheric convection and ocean circulation. To address this challenge, inspired by the translation equivariance of the convolution kernel, we propose a novel hierarchical Fourier neural operator with convolution-residual layers and attention mechanisms, designed to be complementary in the frequency domain, to solve complex PDEs. We perform experiments on forward and inverse problems of multiscale elliptic equations, Navier-Stokes equations, and other physical scenarios, and find that the proposed method achieves superior performance on these PDE benchmarks, especially for equations characterized by rapid coefficient variations.
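
To make the low-frequency observation concrete, here is a minimal sketch of a mode-truncated spectral multiplication of the kind used in Fourier layers. It is an illustration under stated assumptions, not the authors' implementation: it is 1D and single-channel, it omits the pointwise linear path and nonlinearity of a full FNO layer, and `spectral_conv_1d`, `weights`, and `modes` are hypothetical names with identity multipliers standing in for learned parameters.

```python
import numpy as np

def spectral_conv_1d(u, weights, modes):
    """Multiply the lowest `modes` Fourier coefficients of a real 1D signal.

    u       : real array of shape (n,)
    weights : complex array of shape (modes,), stand-in for learned multipliers
    modes   : number of lowest frequencies that are kept
    """
    u_hat = np.fft.rfft(u)                      # shape (n // 2 + 1,)
    out_hat = np.zeros_like(u_hat)              # every frequency starts at zero
    out_hat[:modes] = weights * u_hat[:modes]   # frequencies >= modes are dropped
    return np.fft.irfft(out_hat, n=u.shape[0])

n, modes = 128, 8
x = np.arange(n) / n
low = np.sin(2 * np.pi * 3 * x)     # frequency 3, below the cutoff
high = np.sin(2 * np.pi * 40 * x)   # frequency 40, above the cutoff
weights = np.ones(modes, dtype=complex)  # identity multipliers for illustration

out_low = spectral_conv_1d(low, weights, modes)    # passes through unchanged
out_high = spectral_conv_1d(high, weights, modes)  # numerically zero
```

With these settings the low-frequency input is reproduced while the high-frequency input is removed entirely, which illustrates why a truncated spectral kernel alone struggles with rapidly oscillating coefficients or solutions.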

Paper Structure (42 sections, 27 equations, 5 figures, 6 tables)

Figures (5)

  • Figure 1: Examples of various tasks, including Darcy-Rough, Trigonometric, Darcy-Smooth, Navier-Stokes, Pipe, and Elasticity datasets. These datasets solve equations according to coefficients, previous solutions, and structures by approximating mappings between input and output in coordinate spaces. All these tasks are covered in experimental verification.
  • Figure 2: The overall network architecture. The input is downsampled and processed at each scale using equivariant attention and convolutional-residual Fourier layers. The final output is obtained by upsampling the outputs of hierarchical layers.
  • Figure 3: Showcase of Darcy-Rough Elliptic Equations, where the high-frequency components are moved to the center. G.T. and F.D. denote the ground truth and frequency domain. The absolute error is computed as $|u-\hat{u}|$.
  • Figure 4: Showcase of prediction results and absolute error of our model in the Navier-Stokes equation dataset. First row: $\{2, 4, 6, 8, 10\}$ time steps input sequence; Second row: $\{12, 14, 16, 18, 20\}$ time steps ground truth sequence; Third row: corresponding prediction sequence; Last row: corresponding absolute prediction error $|\hat{x}-y|$.
  • Figure 5: Showcase of multiscale PDEs, where the high-frequency components are moved to the center. G.T. and F.D. denote the ground truth and frequency domain. The absolute error is computed as $|u-\hat{u}|$.