Koopman neural operator as a mesh-free solver of non-linear partial differential equations

Wei Xiong, Xiaomeng Huang, Ziyang Zhang, Ruixuan Deng, Pei Sun, Yang Tian

TL;DR

The paper tackles the challenge of solving nonlinear PDE families with reliable long-term predictions in a mesh-free setting. It introduces the Koopman neural operator (KNO), which learns a time-dependent Koopman operator that acts on the flow mapping to convert nonlinear PDE evolution into a linear predictive problem, using an offline, ergodicity-guided Hankel-Krylov representation to approximate the operator. The architecture combines Fourier-domain processing, a Hankel-based Koopman module, a high-frequency complement, and a reconstruction-driven loss, enabling robust mesh-independence and strong zero-shot generalization across discretizations and time horizons. Experiments on 1D/2D PDEs and real dynamic systems demonstrate improved accuracy and stability over state-of-the-art neural operators, with practical implications for PDE solving, turbulence modeling, and forecasting of complex geophysical and atmospheric processes.

Abstract

The lack of analytic solutions for diverse partial differential equations (PDEs) has given rise to a series of computational techniques for numerical solutions. Although numerous recent advances have been made in developing neural operators, a kind of neural-network-based PDE solver, these solvers become less accurate and less explainable when learning long-term behaviors of non-linear PDE families. In this paper, we propose the Koopman neural operator (KNO), a new neural operator, to overcome these challenges. With the same objective of learning an infinite-dimensional mapping between Banach spaces that serves as the solution operator of the target PDE family, our approach differs from existing models by formulating a non-linear dynamic system of the equation solution. By approximating the Koopman operator, an infinite-dimensional operator governing all possible observations of the dynamic system, to act on the flow mapping of the dynamic system, we can equivalently learn the solution of a non-linear PDE family by solving simple linear prediction problems. We validate the KNO in mesh-independent, long-term, and zero-shot predictions on five representative PDEs (e.g., the Navier-Stokes equation and the Rayleigh-Bénard convection) and three real dynamic systems (e.g., global water vapor patterns and western boundary currents). In these experiments, the KNO exhibits notable advantages compared with previous state-of-the-art models, suggesting the potential of the KNO in supporting diverse science and engineering applications (e.g., PDE solving, turbulence modelling, and precipitation forecasting).
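
The idea of replacing non-linear evolution with a linear prediction problem can be illustrated with a small delay-embedding (Hankel) sketch. This is not the KNO architecture: it is a plain least-squares fit in NumPy, and the function names (`hankel_matrix`, `fit_linear_operator`, `predict`), the toy signal, the window length, and the rank cutoff are all assumptions made for illustration. The toy signal is chosen so that a finite linear operator on the delay vectors reproduces it exactly; for a genuinely non-linear PDE such an operator would only be an approximation.

```python
import numpy as np

def hankel_matrix(series, m):
    """Build a Hankel matrix whose column j is the delay vector series[j:j+m]."""
    n_cols = len(series) - m + 1
    return np.stack([series[j:j + m] for j in range(n_cols)], axis=1)

def fit_linear_operator(H, rcond=1e-10):
    """Least-squares K such that H[:, 1:] is approximately K @ H[:, :-1].

    rcond discards near-zero singular values (an illustrative cutoff).
    """
    X, Y = H[:, :-1], H[:, 1:]
    return Y @ np.linalg.pinv(X, rcond=rcond)

def predict(K, window, steps):
    """Advance a delay vector linearly and collect the newest entry each step."""
    state = np.array(window, dtype=float)
    out = []
    for _ in range(steps):
        state = K @ state          # one linear prediction step
        out.append(state[-1])      # last entry is the newly predicted sample
    return np.array(out)

# Toy observable: a sum of two oscillations (hypothetical data).
t = np.arange(200)
series = np.sin(0.3 * t) + 0.5 * np.cos(0.7 * t)
train, future = series[:150], series[150:]

H = hankel_matrix(train, m=8)
K = fit_linear_operator(H)
pred = predict(K, H[:, -1], steps=len(future))
print("max abs error over the forecast horizon:", np.max(np.abs(pred - future)))
```

The only learned object here is the matrix `K`; every forecast step is a matrix-vector product, which is the sense in which the prediction problem becomes linear.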

Paper Structure

This paper contains 30 sections, 28 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Conceptual illustrations of neural network architectures of the KNO. Note that the layout of every part is slightly reorganized for clarity.
  • Figure 2: Experiment results of mesh-independence. (a-d) respectively present the results under different conditions of high-frequency complement designs and $\lambda$ in the KNO. Here the high-frequency complement is either realized by a single convolutional layer, $\mathcal{C}_{s}$, or by a simple tripartite convolutional network, $\mathcal{C}_{t}$. Parameter $\lambda$ controls the contributions of low- and high-frequency information. Model performance is measured using the root mean square error (RMSE). In the legends of (a-d), the numbers within brackets indicate the parameter settings and model sizes, where the last number corresponds to the model size and all other numbers denote parameters. The notation $w$ in the FNO stands for the width parameter (i.e., the dimension of latent space) (Li et al., 2020). Note that the results of the FNO are not repeated in (b-d) since the adjustment of $\lambda$ has no relation to the FNO. Therefore, the performances of the KNO models in (b-d) can be directly compared with the performance of the FNO in (a).
  • Figure 3: Experiment results of long-term prediction. (a-h) show the performances of all models on different data sets, where each prediction step creates a time frame. (i) visualizes the prediction results (the twenty-sixth time frame after the initial condition is selected as an instance for illustration) and the associated errors of all models on the Rayleigh-Bénard convection. Note that color bars in (i) are shared by all models. Therefore, the results can be directly compared across different models. (j) presents the prediction results (the third time frame after the initial condition) and the associated errors of the KNO on the Kuroshio and Gulf Stream currents.
  • Figure 4: Results of the zero-shot experiment (discretization granularity) on the 1-dimensional Bateman–Burgers equation. (a-d) respectively present the results under different conditions of high-frequency complement designs and $\lambda$ in the KNO. Here the high-frequency complement can be either realized by a single convolutional layer, $\mathcal{C}_{s}$, or by a simple tripartite convolutional network, $\mathcal{C}_{t}$. Parameter $\lambda$ controls the contributions of low- and high-frequency information.
  • Figure 5: Results of the zero-shot experiment (prediction interval) on the 2-dimensional Rayleigh-Bénard convection data. (a-b) show the results of the first kind of zero-shot prediction, where models are supervised by the time frames separated by 2 or 4 untrained frames during training, respectively. The result of the ResNet is absent in (b) because the ResNet does not converge well during training. (c-d) present the results of the second kind of zero-shot prediction, where there are 20 or 30 unsupervised time frames during testing. Note that the lines shown in (a-d) are smoothed using the B-spline basis offered by SciPy (Virtanen et al., 2020) because the raw performances of the ResNet and the CNO oscillate. (e) visualizes the prediction results (the twenty-sixth time frame after the initial condition is selected as an instance for illustration) and the associated errors of all models during the zero-shot experiment shown in (d).
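
The caption of Figure 5 mentions smoothing oscillating error curves with a B-spline basis from SciPy. A minimal sketch of one way to do this is shown below. The caption does not say which SciPy routine or smoothing strength was used, so the choice of `splrep`/`splev`, the helper name `smooth_curve`, the smoothing value, and the synthetic error curve are all assumptions for illustration.

```python
import numpy as np
from scipy.interpolate import splrep, splev

def smooth_curve(x, y, smoothing):
    """Fit a cubic smoothing B-spline to (x, y) and evaluate it back at x.

    A larger `smoothing` allows a larger squared residual, giving a smoother curve.
    """
    tck = splrep(x, y, s=smoothing)   # knots, coefficients, degree (k=3 by default)
    return splev(x, tck)

# Synthetic, hypothetical error curve: slow drift plus oscillating noise.
rng = np.random.default_rng(0)
steps = np.arange(1, 51, dtype=float)
raw_error = 0.1 + 0.01 * steps + 0.05 * rng.standard_normal(steps.size)

smoothed_error = smooth_curve(steps, raw_error, smoothing=steps.size * 0.05 ** 2)
print("total variation raw vs smoothed:",
      np.abs(np.diff(raw_error)).sum(), np.abs(np.diff(smoothed_error)).sum())
```

Smoothing of this kind changes only how a curve is drawn; any quantitative comparison should still use the raw per-step errors.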