Fast Summation on the Sphere with Applications to the Barotropic Vorticity Equation

Anthony Chen, Christiane Jablonowski

TL;DR

This work addresses efficient computation of sphere-based convolutions arising in geophysical fluid dynamics by introducing a kernel-independent spherical tree code built on an iteratively refined icosahedral grid. It couples a Lagrangian discretization of the Barotropic Vorticity Equation with high-order, triangle-based interpolation (SBB) and four interaction modes (PP, PC, CP, CC) under a dual-tree traversal to achieve $O(N\log N)$ complexity. The authors validate the approach on Rossby-Haurwitz waves, Gaussian vortices with AMR, and polar vortex collapse scenarios, demonstrating strong scaling and accuracy comparable to direct sums or kernel-specific methods like BLTC. The results enable high-resolution, long-time simulations on the sphere and point toward extensions to shallow water equations, self-attraction loading effects, and GPU-based acceleration for even larger problems.
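As a small illustration of the iteratively refined icosahedral grid, the sketch below counts grid points per refinement level. It assumes the standard scheme in which every triangle is split into four at each level, which gives $10 \cdot 4^k + 2$ vertices at level $k$; the function name is hypothetical and not taken from the paper. The formula reproduces the particle counts quoted in the figure captions (642 and 163842).

```python
def icosahedral_point_count(level: int) -> int:
    """Vertex count of an icosahedron after `level` rounds of 4-way triangle splitting.

    Assumption: each refinement adds one new vertex per edge, so
    V_k = 10 * 4**k + 2 (V_0 = 12 for the base icosahedron).
    """
    if level < 0:
        raise ValueError("level must be non-negative")
    return 10 * 4**level + 2


# Point counts for refinement levels 0 through 7.
counts = {level: icosahedral_point_count(level) for level in range(8)}
for level, n in counts.items():
    print(level, n)
```

Under this assumption, levels 3 and 7 correspond to the 642-particle and 163842-particle configurations, respectively.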

Abstract

Fast summation refers to a family of techniques for approximating $O(N^2)$ sums in $O(N\log{N})$ or $O(N)$ time. These techniques have traditionally found wide use in astrophysics and electrostatics for calculating the forces in an $N$-body problem. In this work, we present a spherical tree code and apply it to the problem of efficiently solving the barotropic vorticity equation.
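To make the $O(N^2)$ baseline concrete, here is a minimal sketch of a direct sum of the form $u(x_i) = \sum_j K(x_i, y_j)\, w_j$. The function name, the weights, and the placeholder kernel are illustrative assumptions; the kernel is not the one used in the paper.

```python
def direct_sum(targets, sources, weights, kernel):
    """Evaluate u(x_i) = sum_j kernel(x_i, y_j) * w_j by brute force.

    Cost is len(targets) * len(sources) kernel evaluations, i.e. O(N^2)
    when both sets have N points. A tree code approximates this sum.
    """
    result = []
    for x in targets:
        total = 0.0
        for y, w in zip(sources, weights):
            total += kernel(x, y) * w
        result.append(total)
    return result


# Count kernel evaluations to show the quadratic cost.
calls = 0

def counting_kernel(x, y):
    # Placeholder kernel (dot product of two 3-vectors), for illustration only.
    global calls
    calls += 1
    return sum(a * b for a, b in zip(x, y))

points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
values = direct_sum(points, points, [1.0, 2.0, 3.0], counting_kernel)
```

With three targets and three sources, the kernel is evaluated nine times; this is the cost that a fast summation method is meant to reduce.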

Paper Structure

This paper contains 19 sections, 42 equations, and 11 figures.

Figures (11)

  • Figure 1: The four types of interactions. The blue particles are the target particles and the red particles are the source particles. When a triangle has many particles, as seen in \ref{fig:pctri}, \ref{fig:cptri}, and \ref{fig:cctri}, we use the proxy points marked with crosses. $R$ is the distance between the centers of the two triangles, $r_t$ is the radius of the target triangle, and $r_s$ is the radius of the source triangle.
  • Figure 2: We observe the speedup of the spherical tree code over direct summation, with the expected scalings. These tests are run on one processor; for the fast summation, we set $\theta=0.7$ and $d=6$, which gives between 3 and 4 digits of accuracy.
  • Figure 3: On the left, we plot the strong scaling of both the direct and fast summation for a fixed problem size of 163842 particles. On the right, we plot the parallel efficiency. For these tests, we use $\theta=0.7$ and $d=6$.
  • Figure 4: Here, we plot the error of the BLTC and our spherical tree code as we increase the interpolation degree. For the BLTC, we vary the interpolation degree from $d=2$ to $d=10$ in increments of $2$. For our spherical tree code, we vary the interpolation degree from $d=2$ to $d=14$ in increments of $2$. For both the BLTC and the spherical tree code, we use $\theta=0.7$. The runtime for direct summation is given by the black line across the top.
  • Figure 5: On the left, we plot the Rossby-Haurwitz wave initial condition with a wave number of $4$. On the right, we plot the relative error as we increase the particle count from 642, corresponding to 8-degree resolution, to 163842, corresponding to 0.5-degree resolution. We do not apply fast summation to the configuration with only 642 particles because there is very little room for speedup. In this case, the error comes from the discretization of the integral, the interpolation in the remeshing, and the fast summation. Here, the time step is small enough for the time stepping error to be negligible. For these runs, we use $\theta=0.7$.
  • ...and 6 more figures
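
The quantities in the Figure 1 caption ($R$, $r_t$, $r_s$) and the parameter $\theta$ suggest a separation test for choosing among the four interaction types. The sketch below assumes the criterion $(r_t + r_s)/R < \theta$ that is common in dual-tree methods; the captions do not state the inequality, so this is an assumption rather than the paper's definition. The function name, the particle-count threshold `n_min`, the `"recurse"` outcome, and the reading of PC as target particles against source proxy points (and CP as the reverse) are likewise hypothetical.

```python
def choose_interaction(R, r_t, r_s, n_t, n_s, theta=0.7, n_min=10):
    """Pick an interaction type for a (target triangle, source triangle) pair.

    R      : distance between the triangle centers
    r_t    : radius of the target triangle
    r_s    : radius of the source triangle
    n_t    : number of particles in the target triangle
    n_s    : number of particles in the source triangle
    theta  : separation parameter (assumed criterion: (r_t + r_s) / R < theta)
    n_min  : hypothetical count above which a triangle is replaced by proxy points
    """
    well_separated = R > 0 and (r_t + r_s) / R < theta
    if not well_separated:
        # Too close for a proxy-point approximation: descend the tree instead.
        return "recurse"
    target_is_cluster = n_t > n_min  # proxy points on the target side
    source_is_cluster = n_s > n_min  # proxy points on the source side
    if target_is_cluster and source_is_cluster:
        return "CC"
    if source_is_cluster:
        return "PC"
    if target_is_cluster:
        return "CP"
    return "PP"


print(choose_interaction(R=1.0, r_t=0.1, r_s=0.1, n_t=100, n_s=100))
```

Smaller values of `theta` make the test stricter, so more pairs are refined further before any approximation is applied.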