Multi-Lattice Sampling of Quantum Field Theories via Neural Operator-based Flows

Bálint Máté, François Fleuret

TL;DR

This work proposes to approximate a time-dependent neural operator whose time integral provides a mapping between the functional distributions of the free and target theories, and experimentally validates the proposal on the 2-dimensional $\phi^4$-theory.

Abstract

We consider the problem of sampling field configurations on a lattice from the Boltzmann distribution corresponding to some action. Since such densities arise as approximations of an underlying functional density, we frame the task as an instance of operator learning. We propose to approximate a time-dependent neural operator whose time integral provides a mapping between the functional distributions of the free and target theories. Once a particular lattice is chosen, the neural operator can be discretized to a finite-dimensional, time-dependent vector field, which in turn induces a continuous normalizing flow between finite-dimensional distributions over the chosen lattice. This flow can then be trained to be a diffeomorphism between the discretized free and target theories on the chosen lattice and, by construction, can be evaluated on different discretizations of spacetime. We experimentally validate the proposal on the 2-dimensional $\phi^4$-theory to explore to what extent such operator-based flow architectures generalize to lattice sizes they were not trained on, and show that pretraining on smaller lattices can lead to a speedup over training directly on the target lattice size.

Paper Structure (14 sections, 16 equations, 10 figures, 2 tables)

This paper contains 14 sections, 16 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Schematic overview of the probability distributions of interest. The top row shows the functional distributions of the free theory and the target theory connected by the time-dependent operator $\mathcal{V}_t$. Moving to the bottom row corresponds to approximating each object from the top row on a discrete lattice. In particular, in the bottom row all objects are finite-dimensional, well-defined and can be worked with numerically.
  • Figure 2: An operator that maps an initial condition $u(0,x)$ (top row) to its time-evolved state $u(\Delta t,x)$ (bottom row), where the time evolution is given by the heat equation $\Delta u = \partial_t u$. The blue dots denote the evaluation of $u(0,x)$ on a discrete mesh, while the orange dots denote the output of the operator (a convolution in this case) evaluated on that same mesh. As the mesh gets denser, the operator becomes a better approximation of the map between the continuous $u(0,x)$ (blue curve) and $u(\Delta t,x)$ (orange curve). For our application, we will be interested in the time evolution of the probability density of fields (corresponding to $u$ in the plots) along the time interval $[0,1]$ connecting the free theory to an interacting theory.
  • Figure 3: Sketch of the architecture. (a) Pointwise field embedding via $f_{\theta_1}$, continuous convolution with kernel $\tilde{K}_{\theta_2}$ to aggregate local information, and combination of the pointwise field values with neighborhood information via $\tau_{\theta_3}$. (b) Contracting the channels with learnable time-dependent weights given by $\kappa_{\theta_4}(t)$. (c) Averaging of the preceding steps over the sign of $\phi$ to enforce the $\mathbb Z_2$-symmetry of the theory.
  • Figure 4: Particle in double well potential (§ \ref{sec:exp1}). $\mathrm{ESS}$, $\langle M \rangle$, and $\langle |M| \rangle$ computed from $16384$ samples at different lattice sizes. The blue crosses correspond to lattice sizes that the model was trained on, while orange dots denote lattice sizes unseen by the network during training. Note that the absolute magnetization converges to a value of $1.30$ as the lattice size is increased to $16$ and stabilizes at this value at larger lattice sizes.
  • Figure 5: Particle in double well potential (§ \ref{sec:exp1}). The two-point correlation function $G(x,y)$ computed from $16384$ samples on lattices the model was trained on (left) and on lattices the model was not trained on (right). Because of the symmetries of the task, the correlation function only depends on the distance $r = |x-y|$, thus the function $G(r)$ is plotted. Similarly to the absolute value of the magnetization, $G(r)$ approaches the continuum limit as the model is evaluated at increasing lattice sizes.
  • ...and 5 more figures
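
The idea behind Figure 2, one operator evaluated on meshes of different density, can be illustrated with a minimal sketch. This is not the paper's code: it assumes a periodic domain $[0, 2\pi)$, the initial condition $u(0,x) = \sin x$ (for which the exact solution of $\partial_t u = \Delta u$ is $e^{-t}\sin x$), a time step $\Delta t = 0.1$, and a heat kernel truncated to the nearest periodic image. The names `heat_kernel`, `apply_operator`, and `max_error` are hypothetical.

```python
import math


def heat_kernel(d, dt):
    """Heat kernel of u_t = u_xx on the real line at separation d."""
    return math.exp(-d * d / (4.0 * dt)) / math.sqrt(4.0 * math.pi * dt)


def apply_operator(u0, dt, n):
    """Evaluate the convolution operator on a periodic mesh of n points.

    The same continuous kernel is used for every n; only the mesh changes.
    Assumption: the domain is [0, 2*pi) and only the nearest periodic
    image of the kernel is kept, which is accurate for small dt.
    """
    h = 2.0 * math.pi / n
    xs = [i * h for i in range(n)]
    out = []
    for xi in xs:
        acc = 0.0
        for xj in xs:
            # Shortest periodic separation, in [-pi, pi).
            d = (xi - xj + math.pi) % (2.0 * math.pi) - math.pi
            acc += heat_kernel(d, dt) * u0(xj)
        out.append(h * acc)  # Riemann sum approximating the integral
    return xs, out


def max_error(n, dt=0.1):
    """Largest deviation from the exact solution exp(-dt) * sin(x) on the mesh."""
    xs, out = apply_operator(math.sin, dt, n)
    return max(abs(o - math.exp(-dt) * math.sin(x)) for x, o in zip(xs, out))


# Error of the discretized operator on increasingly dense meshes.
errors = {n: max_error(n) for n in (4, 8, 16, 32)}
for n, err in errors.items():
    print(f"n = {n:2d}  max error = {err:.3e}")
```

Because the kernel is defined as a function of the continuous separation rather than as a fixed stencil, nothing in `apply_operator` is tied to a particular mesh size. In this toy setting the error shrinks as the mesh is refined, which mirrors the behaviour described in the caption.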