Equivariant geometric convolutions for emulation of dynamical systems

Wilson G. Gregory, David W. Hogg, Ben Blum-Smith, Maria Teresa Arias, Kaze W. K. Wong, Soledad Villar

TL;DR

Enforcing coordinate freedom with geometric convolutions requires no major changes to the model architecture and, in numerical experiments emulating 2D compressible Navier-Stokes, yields better accuracy and improved stability than baseline surrogate models in almost all cases, providing a recipe for any CNN-based method applied to an appropriate class of problems.

Abstract

Machine learning methods are increasingly being employed as surrogate models in place of computationally expensive and slow numerical integrators for a bevy of applications in the natural sciences. However, while the laws of physics are relationships between scalars, vectors, and tensors that hold regardless of the frame of reference or chosen coordinate system, surrogate machine learning models are not coordinate-free by default. We enforce coordinate freedom by using geometric convolutions in three model architectures: a ResNet, a Dilated ResNet, and a UNet. In numerical experiments emulating 2D compressible Navier-Stokes, we see better accuracy and improved stability compared to baseline surrogate models in almost all cases. The ease of enforcing coordinate freedom without making major changes to the model architecture provides an exciting recipe for any CNN-based method applied to an appropriate class of problems.

Paper Structure (28 sections, 11 theorems, 57 equations, 4 figures, 2 tables)

This paper contains 28 sections, 11 theorems, 57 equations, 4 figures, 2 tables.

Key Result

Proposition 1

A function $f:\mathcal{A}_{N,d,k,p}\to \mathcal{A}_{N,d,k',p'}$ is a translation equivariant linear function if and only if it can be written as $\iota_{k}\qty(A \ast C)$ for some geometric filter $C \in \mathcal{A}_{M,d,k+k',p\,p'}$. When $N$ is odd, $M=N$, otherwise $M=N+
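The "convolve, then contract" form in Proposition 1 can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' implementation: it assumes $d=2$, a vector image ($k=1$) mapped to a vector image ($k'=1$), periodic boundaries, and a small $3\times 3$ filter rather than the full-size filter of the proposition. The function name `conv_then_contract` is hypothetical, and the choice to contract the image index against the first filter index is an assumption about how $\iota_k$ pairs indices.

```python
import numpy as np

def conv_then_contract(A, C):
    """Sketch of iota_k(A * C) for k = k' = 1 in d = 2 (assumed conventions).

    A: vector image of shape (N, N, 2), treated as periodic (a torus).
    C: 2-tensor filter of shape (M, M, 2, 2), with M odd.
    Returns a vector image of shape (N, N, 2).
    """
    M = C.shape[0]
    m = M // 2
    out = np.zeros(A.shape[:2] + (C.shape[-1],))
    for a in range(-m, m + 1):
        for b in range(-m, m + 1):
            # shifted[i, j] = A[i - a, j - b], with periodic wrap-around
            shifted = np.roll(A, shift=(a, b), axis=(0, 1))
            # Tensor product of the pixel's vector with the filter's 2-tensor,
            # followed by contraction over the shared index u, fused in one einsum.
            out += np.einsum("iju,uv->ijv", shifted, C[a + m, b + m])
    return out

# Check translation equivariance: shifting the input shifts the output.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5, 2))      # random vector image, N = 5
C = rng.normal(size=(3, 3, 2, 2))   # random 2-tensor filter, M = 3
t = (2, 1)                          # translation on the torus

lhs = conv_then_contract(np.roll(A, t, axis=(0, 1)), C)
rhs = np.roll(conv_then_contract(A, C), t, axis=(0, 1))
print(np.allclose(lhs, rhs))
```

The check holds for an arbitrary random filter, which reflects one direction of the proposition: any map of this form is linear and translation equivariant. Equivariance under rotations and reflections would additionally require restricting the filter, which this sketch does not attempt.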

Figures (4)

  • Figure 1: Examples of geometric images in the natural sciences. (a) A visualization of a temperature map and a polarization map from the ESA Planck Mission [planck2015]. The color map shows a temperature field (a scalar or ${0}_{(+)}$-tensor) on the sphere, and the whiskers show the principal eigenvector direction of a ${2}_{(+)}$-tensor field in two dimensions. (b) Two-dimensional maps of ocean current (arrows; a vector or ${1}_{(+)}$-tensor field) and ocean salinity (color; a scalar or ${0}_{(+)}$-tensor field) [climatedataguide]. (c) A three-dimensional map of temperature (a scalar or ${0}_{(+)}$-tensor field) based on sensors distributed throughout the volume of a granary [granary]. (d) A two-dimensional map of potential vorticity (a pseudoscalar or ${0}_{(-)}$-tensor field) in the Earth's atmosphere, measured for the purposes of predicting storms [potentialvorticity].
  • Figure 2: (a) All the filters for $d=2$, $M=3$, $k \in \qty{0,1,2}$. Where there is no symbol in the box the value is zero. There are no $B_d$-isotropic pseudoscalar filters at $d=2, M=3$. (b) Each signed component in the ${2}_{(p)}$-tensor has a particular icon, with the positive diagonal elements represented by the green double arrows, the negative diagonal elements represented by the black double arrows, and the off diagonal elements represented by the petals. Each element rotates in the obvious way, and ${2}_{(+)}$-tensors reflect in the obvious way as well. However, reflections on negative parity diagonal elements flip the sign (color) of the double arrows and have no effect on the petals other than changing their pixel location.
  • Figure 3: (a) Convolution of a scalar image with a scalar and vector filter. (b) Example architecture taking a vector image and scalar image as input and output. Linear layers are shown by the blue convolution arrows followed by green contraction arrows. The black arrows represent nonlinearities. The orange blocks represent multiple channels of images at that tensor order.
  • Figure 4: (a) Five steps of an M0.1 rollout using the best-performing model, the equivariant UNet without LayerNorm. The x-component of the velocity is plotted. (b) Comparison of test performance over a 15-step rollout on the M0.1 data set. The SMSE is shown for each step, rather than a cumulative loss.

Theorems & Definitions (38)

  • Definition 1: (pseudo-)scalars
  • Definition 2: (pseudo-)vectors
  • Definition 3: ${k}_{(p)}$-tensors
  • Definition 4: Einstein summation notation
  • Definition 5: tensor product
  • Definition 6: $k$-contraction
  • Definition 7: $\ell_2$ tensor norm
  • Definition 8: geometric image
  • Definition 9: geometric convolution
  • Definition 10: $\text{max\,pool}_b$
  • ...and 28 more