Neural Operators with Localized Integral and Differential Kernels

Miguel Liu-Schiaffini, Julius Berner, Boris Bonev, Thorsten Kurth, Kamyar Azizzadenesheli, Anima Anandkumar

TL;DR

The paper addresses the need for local inductive biases in neural operators to better capture local PDE features while preserving discretization independence. It introduces two localized operator families: a differential kernel layer that converges to differential operators under refinement, and DISCO-based local integral kernel layers that realize local integral transforms on general geometries. By augmenting Fourier neural operators with these two local branches, the authors demonstrate significant performance gains (up to ~72% relative L2-error reduction) across Darcy flow, turbulent Navier–Stokes, diffusion–reaction, and spherical shallow-water problems, including unstructured meshes. The work highlights improved generalization across resolutions and geometries, and provides a principled path to combining local and global operators in neural operator architectures for scientific computing.

Abstract

Neural operators learn mappings between function spaces, which is practical for learning solution operators of PDEs and other scientific modeling applications. Among them, the Fourier neural operator (FNO) is a popular architecture that performs global convolutions in the Fourier space. However, such global operations are often prone to over-smoothing and may fail to capture local details. In contrast, convolutional neural networks (CNN) can capture local features but are limited to training and inference at a single resolution. In this work, we present a principled approach to operator learning that can capture local features under two frameworks by learning differential operators and integral operators with locally supported kernels. Specifically, inspired by stencil methods, we prove that we obtain differential operators under an appropriate scaling of the kernel values of CNNs. To obtain local integral operators, we utilize suitable basis representations for the kernels based on discrete-continuous convolutions. Both these approaches preserve the properties of operator learning and, hence, the ability to predict at any resolution. Adding our layers to FNOs significantly improves their performance, reducing the relative L2-error by 34-72% in our experiments, which include a turbulent 2D Navier-Stokes and the spherical shallow water equations.

Paper Structure

This paper contains 42 sections, 1 theorem, 41 equations, 8 figures, and 5 tables.

Key Result

Proposition 3.1

Let $D_h\subset \mathbb{R}^d$ be a regular grid of width $h$ and let $v\in C^1(D, \mathbb{R}^n)$. Then, for every kernel $(K_i)_{i=1}^S \subset \mathbb{R}^{n}$, there exists $(b_j)_{j=1}^n\subset\mathbb{R}^d$ such that for every $y\in D_h$, where $\bar{K}=\sum_{i=1}^S K_i$.
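
The idea behind the first-order differential layer (stencil weights that are centered and rescaled with the grid width) can be checked numerically. The sketch below is a minimal 1D illustration and not the authors' implementation. It assumes a 3-point stencil with offsets $-1, 0, +1$, periodic boundaries, centering by subtracting the kernel mean so that the weights sum to zero, and scaling by $1/h$. The helper names `diff_layer_1d`, `limit_coefficient`, and `max_error` are hypothetical. Under these assumptions, a first-order Taylor expansion $v(y + h z_i) = v(y) + h z_i v'(y) + o(h)$ suggests that the output approaches $b\, v'(y)$ with $b = \sum_i (K_i - \text{mean}(K))\, z_i$ as $h \to 0$.

```python
import numpy as np

# Stencil offsets in grid units (assumption: 3-point stencil in 1D).
OFFSETS = np.array([-1, 0, 1])

def diff_layer_1d(v, kernel, h):
    """Apply a mean-centered stencil to periodic samples v, scaled by 1/h."""
    centered = kernel - kernel.mean()  # centered weights sum to zero
    # np.roll(v, -o)[i] == v[i + o], i.e. the sample at y + o * h
    out = sum(w * np.roll(v, -o) for w, o in zip(centered, OFFSETS))
    return out / h

def limit_coefficient(kernel):
    """Coefficient b of v' predicted by the first-order Taylor expansion."""
    centered = kernel - kernel.mean()
    return float(np.dot(centered, OFFSETS))

def max_error(n, kernel):
    """Max deviation between the layer output and b * v' for v = sin on n points."""
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    h = 2.0 * np.pi / n
    out = diff_layer_1d(np.sin(x), kernel, h)
    return float(np.max(np.abs(out - limit_coefficient(kernel) * np.cos(x))))

if __name__ == "__main__":
    kernel = np.array([0.3, -1.2, 2.0])  # arbitrary illustrative weights
    for n in (64, 256, 1024):
        print(n, max_error(n, kernel))
```

With the same kernel values at every resolution, the printed error shrinks as the grid is refined, which is the behaviour one would expect from a layer that converges to a differential operator. Without the $1/h$ scaling and the centering, the same stencil would instead tend to a multiple of $v(y)$.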

Figures (8)

  • Figure 2: A single layer of our local neural operator. We add (up to) two local operations using convolutions with a differential kernel and a local integral kernel.
  • Figure 3: Initial condition, ground truth, and corresponding autoregressive predictions of our proposed models for the Navier-Stokes problem and the shallow water equations.
  • Figure 4: Ground truth and prediction of the horizontal velocity for the flow past a cylinder. The data is represented on an unstructured mesh, which is visualized in gray color. Our proposed architecture can be readily applied to data on unstructured meshes such as this one. For this figure, we learn the residual to the previous time step.
  • Figure 5: Empirical evaluation of the proposed differential kernel. (a) $L^2$ errors at various resolutions and quadratic coefficient scales $c$. (b) True differential operator for $c = 1$. (c) Output of the differential kernel at a resolution of $32 \times 32$. (d) Output of the differential kernel at a resolution of $64 \times 64$. (e) Output of the differential kernel at a resolution of $4096 \times 4096$.
  • Figure 6: Radial, piecewise linear basis functions for the approximation of anisotropic filters on the sphere.
  • ...and 3 more figures

Theorems & Definitions (6)

  • Proposition 3.1: First-order differential layer
  • Remark 3.2: Higher-order differential operator
  • Remark 3.3: Exact integration and equivariance
  • Definition 2.1: Group Convolution
  • Remark 2.2
  • Definition 2.3: DISCO convolutions
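
The local integral layers build on discrete-continuous (DISCO) convolutions, in which the kernel stays a continuous function expressed in a fixed basis and only the integral is discretized. The sketch below is a simplified 1D illustration under stated assumptions, not the paper's implementation: it uses a periodic domain $[0, 1)$, a uniform grid with quadrature weight $h$, and piecewise-linear hat functions as the basis. The names `hat_basis`, `disco_conv_1d`, `theta`, and `radius` are hypothetical.

```python
import numpy as np

def hat_basis(t, radius, n_basis):
    """Evaluate n_basis hat functions supported in [-radius, radius] at offsets t (2D array)."""
    # Interior nodes only, so every hat vanishes at and beyond +/- radius.
    centers = np.linspace(-radius, radius, n_basis + 2)[1:-1]
    width = centers[1] - centers[0]
    # Result has shape (n_basis, *t.shape); each hat is 1 at its center.
    return np.clip(1.0 - np.abs(t[None] - centers[:, None, None]) / width, 0.0, None)

def disco_conv_1d(v, x, theta, radius):
    """Quadrature approximation of the integral of k(x' - y) v(x') dx' at every grid point y."""
    h = x[1] - x[0]                      # uniform quadrature weight
    period = h * len(x)
    offsets = x[None, :] - x[:, None]    # x_i - y for all (y, x_i) pairs
    offsets = (offsets + period / 2) % period - period / 2  # periodic wrap
    basis = hat_basis(offsets, radius, len(theta))
    kernel = np.tensordot(theta, basis, axes=1)  # k = sum_l theta_l * basis_l
    return (kernel @ v) * h

if __name__ == "__main__":
    theta = np.array([0.5, 1.0, -0.3])   # stands in for learned coefficients
    for n in (128, 512):
        x = np.linspace(0.0, 1.0, n, endpoint=False)
        out = disco_conv_1d(np.sin(2 * np.pi * x), x, theta, radius=0.1)
        print(n, out[0], out[n // 4])
```

Because the coefficients `theta` parametrize a continuous kernel rather than per-pixel weights, the same parameters can be evaluated on grids of different resolution, and the outputs at shared points should agree up to quadrature error.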