Physics-informed neural networks for the shallow-water equations on the sphere

Alex Bihlo, Roman O. Popovych

TL;DR

The paper demonstrates that physics-informed neural networks can solve the shallow-water equations on the sphere in a meteorological setting, addressing long-time integration challenges with a practical multi-model time-splitting approach and hard-boundary enforcement. By partitioning the integration interval into consecutive subintervals and training a sequence of networks, the method achieves accurate representations of advection, geostrophic balance, mountain-induced flows, and Rossby–Haurwitz waves using far fewer collocation points than traditional solvers. Results on Williamson et al. benchmarks show promising accuracy and meshless evaluation benefits, while discussions highlight limitations related to conservation properties and the need for parameterizations in realistic forecasting. The work points to future enhancements, including enforcing conservation laws as hard constraints and exploring parallel-in-time strategies to further reduce training times for operational-scale models.

Abstract

We propose the use of physics-informed neural networks for solving the shallow-water equations on the sphere in the meteorological context. Physics-informed neural networks are trained to satisfy the differential equations along with the prescribed initial and boundary data, and thus can be seen as an alternative approach to solving differential equations compared to traditional numerical approaches such as finite difference, finite volume or spectral methods. We discuss the training difficulties of physics-informed neural networks for the shallow-water equations on the sphere and propose a simple multi-model approach to tackle test cases of comparatively long time intervals. Here we train a sequence of neural networks instead of a single neural network for the entire integration interval. We also avoid the use of a boundary value loss by encoding the boundary conditions in a custom neural network layer. We illustrate the abilities of the method by solving the most prominent test cases proposed by Williamson et al. [J. Comput. Phys. 102 (1992), 211-224].
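The abstract does not spell out what the custom layer computes. As a minimal sketch only, assuming the condition to encode is periodicity in longitude, one common approach is to pass longitude and latitude through a fixed embedding onto the unit sphere before the dense layers, so that every output is periodic by construction and no boundary loss term is needed. The names `sphere_embedding`, `init_params` and `periodic_network` are hypothetical, and this is not claimed to be the authors' layer.

```python
import numpy as np

def sphere_embedding(lam, theta):
    """Map longitude lam and latitude theta (radians) to points on the unit sphere.

    Assumption: this embedding stands in for the paper's custom layer.
    The result is 2*pi-periodic in lam by construction.
    """
    lam = np.asarray(lam, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return np.stack(
        [np.cos(theta) * np.cos(lam),
         np.cos(theta) * np.sin(lam),
         np.sin(theta)],
        axis=-1,
    )

def init_params(sizes, seed=0):
    """Random weights and zero biases for a dense network with the given layer sizes."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(size=(m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def periodic_network(params, t, lam, theta):
    """Dense tanh network acting on (t, embedded position); linear output layer."""
    t = np.asarray(t, dtype=float)[..., None]
    x = np.concatenate([t, sphere_embedding(lam, theta)], axis=-1)
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

# Usage: 4 inputs (t plus 3 embedded coordinates), 3 hidden layers of 10 units,
# 3 outputs (for example h, u, v). The layer sizes are illustrative.
params = init_params([4, 10, 10, 10, 3])
lam = np.linspace(0.0, 2.0 * np.pi, 7)
theta = np.linspace(-1.0, 1.0, 7)
t = np.linspace(0.0, 1.0, 7)
out = periodic_network(params, t, lam, theta)
out_shifted = periodic_network(params, t, lam + 2.0 * np.pi, theta)
```

Because the periodicity lives in the input transformation, it holds for any weights, including untrained ones; `out` and `out_shifted` agree up to floating-point roundoff.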

Paper Structure

This paper contains 16 sections, 23 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: A five-layer densely connected neural network with 3 hidden layers of 10 units each, and input and output layers of 3 units each.
  • Figure 2: Pictorial representation of the multi-model approach for physics-informed neural networks. Left: Collocation points for a standard single-model physics-informed neural network. Right: Multi-model approach. The total integration interval $[0,12]$ is divided into 3 non-overlapping sub-intervals of equal length and one new physics-informed neural network is trained for each sub-interval sequentially. The initial conditions for the $(i+1)$st model are obtained by evaluating the $i$th model at the right boundary of its time interval. Black points are collocation points for computing $\mathcal{L}_\Delta$, blue points are initial value points for computing $\mathcal{L}_{\rm i}$.
  • Figure 3: Results on the cosine bell advection test case proposed in Williamson et al. (1992) for $\alpha=0$. Top to bottom: Solving this test case with 1, 2, 3 or 4 separate neural networks. The top row thus corresponds to the standard physics-informed neural network case. Left to right: Reconstructions of the initial condition, differences between the initial condition and the solution on day 12, loss functions.
  • Figure 4: Results on the cosine bell advection test case proposed in Williamson et al. (1992) for $\alpha=\pi/2$. Top to bottom: Solution at days $0$, $3$, $6$, $9$ and $12$, as well as the difference between the solutions at days $0$ and $12$.
  • Figure 5: Results on the nonlinear zonal geostrophic flow test case of Williamson et al. (1992) for $\alpha=0$. Left to right: Difference between the solutions at days 0 and 5 for $h$, $u$ and $v$, where $h\in[1000,3000]$ and $|\mathbf{v}|\in[0,40]$.
  • ...and 4 more figures
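
The multi-model scheme described in the Figure 2 caption can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the authors' code: `split_interval`, `train_sequence`, `evaluate` and `toy_train_fn` are hypothetical names, and `toy_train_fn` is a stand-in that returns the exact solution of the scalar problem $du/dt = -0.1\,u$ in place of actually training a physics-informed neural network on the sub-interval.

```python
import math

def split_interval(t0, t1, n):
    """Return n non-overlapping sub-intervals of equal length covering [t0, t1]."""
    dt = (t1 - t0) / n
    return [(t0 + i * dt, t0 + (i + 1) * dt) for i in range(n)]

def train_sequence(u0, intervals, train_fn):
    """Train one model per sub-interval, sequentially.

    train_fn(u0, a, b) must return a callable t -> state valid on [a, b].
    The initial condition of model i+1 is model i evaluated at the right
    boundary of its own sub-interval.
    """
    models = []
    for a, b in intervals:
        model = train_fn(u0, a, b)
        models.append(model)
        u0 = model(b)  # hand the end state over to the next model
    return models

def evaluate(models, intervals, t):
    """Evaluate the piecewise solution at time t using the model that covers t."""
    for model, (a, b) in zip(models, intervals):
        if a <= t <= b:
            return model(t)
    raise ValueError("t lies outside the integration interval")

def toy_train_fn(u0, a, b, rate=0.1):
    """Stand-in for PINN training (assumption): exact solution of du/dt = -rate * u
    with u(a) = u0. A real trainer would minimise the residual and initial-value
    losses on collocation points in [a, b]."""
    return lambda t: u0 * math.exp(-rate * (t - a))

# Usage: the interval [0, 12] split into 3 sub-intervals, as in Figure 2.
intervals = split_interval(0.0, 12.0, 3)
models = train_sequence(1.0, intervals, toy_train_fn)
u_end = evaluate(models, intervals, 12.0)
```

Each model only has to fit a short time window, and the hand-over at the right boundary keeps the piecewise solution continuous across sub-intervals.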