Transforming the Bootstrap: Using Transformers to Compute Scattering Amplitudes in Planar N = 4 Super Yang-Mills Theory

Tianji Cai, Garrett W. Merz, François Charton, Niklas Nolte, Matthias Wilhelm, Kyle Cranmer, Lance J. Dixon

TL;DR

This work shows that Transformers can be applied successfully to problems in theoretical physics that require exact solutions.

Abstract

We pursue the use of deep learning methods to improve state-of-the-art computations in theoretical high-energy physics. Planar N = 4 Super Yang-Mills theory is a close cousin to the theory that describes Higgs boson production at the Large Hadron Collider; its scattering amplitudes are large mathematical expressions containing integer coefficients. In this paper, we apply Transformers to predict these coefficients. The problem can be formulated in a language-like representation amenable to standard cross-entropy training objectives. We design two related experiments and show that the model achieves high accuracy (> 98%) on both tasks. Our work shows that Transformers can be applied successfully to problems in theoretical physics that require exact solutions.
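
To make the "language-like representation" concrete, here is a minimal sketch of how an integer coefficient could be turned into a short token sequence and scored by exact match. The sign-plus-digits scheme, the base of 1000, and the function names are illustrative assumptions, not the encoding confirmed by the text above.

```python
from typing import List


def encode_coefficient(value: int, base: int = 1000) -> List[str]:
    """Encode an integer as a sign token followed by base-`base` digit tokens.

    Assumption: this tokenization is hypothetical and chosen for illustration.
    """
    sign = "+" if value >= 0 else "-"
    magnitude = abs(value)
    digits: List[str] = []
    while True:
        magnitude, remainder = divmod(magnitude, base)
        digits.append(str(remainder))
        if magnitude == 0:
            break
    # Digits were collected least-significant first, so reverse them.
    return [sign] + digits[::-1]


def decode_coefficient(tokens: List[str], base: int = 1000) -> int:
    """Invert encode_coefficient."""
    sign = 1 if tokens[0] == "+" else -1
    magnitude = 0
    for token in tokens[1:]:
        magnitude = magnitude * base + int(token)
    return sign * magnitude


def exact_match_accuracy(predicted: List[List[str]], target: List[List[str]]) -> float:
    """Fraction of sequences in which every token is correct.

    A prediction with a single wrong token counts as wrong, which mirrors
    the requirement that the coefficient be exact.
    """
    if not target:
        return 0.0
    hits = sum(1 for p, t in zip(predicted, target) if p == t)
    return hits / len(target)
```

With a token sequence as the target, each position becomes a classification over a fixed vocabulary, which is the setting in which a cross-entropy objective applies.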

Paper Structure (19 sections, 25 equations, 9 figures, 9 tables)

Figures (9)

  • Figure 1: Sample Feynman diagrams for the process $gg \rightarrow Hg$ at two loops (left) and eight loops (right) in QCD. The same diagrams contribute in SYM, where the Higgs boson $H$ and top quark ($t$) triangle is replaced by a particular local operator in the theory, and the process is referred to as a form factor.
  • Figure 2: Histograms of the symbol coefficients for the three-gluon form factor at 4, 5, and 6 loops. The horizontal axis is the base 10 logarithm of the magnitude of the coefficient. The vertical axis is the (arbitrarily normalized) frequency with which coefficient magnitudes occur in the form factor.
  • Figure 3: Accuracy vs. epoch on the nonzero coefficient-from-key task at loop $L=5$ (left) and $L=6$ (right), for four model initializations shown in different colors. The bottom plots show the balance of predicted signs vs. epoch, with $+$ ($-$) indicating $100\%$ ($0\%$) positive signs. Initially the model fluctuates between strongly favoring one sign or the other before more accurately predicting the mix of signs for individual terms.
  • Figure 4: Relation accuracy (red), magnitude accuracy (blue), sign accuracy (yellow), and coefficient accuracy (green) for each of the named relations, grouped by behavior. Relations in Group 1 (left column) are two-term equivalence relations that are consistently satisfied after only a few epochs. Relations in Group 2 (center column) are relations that require at least two coefficients to have different signs, and are not satisfied until the second phase. Relations in Group 3 (right column) are mixed relations, for which some instances decompose into pairs of equivalent terms (as in Group 1 relations) while others do not.
  • Figure 5: (Left) The leading three PCA components of token embeddings for a 2-layer Transformer with $d=512$ trained for 50 epochs on $L=5$ data, with zeros included. The leading three PCA components explain $63.56\%$ of variance, and dihedral symmetry is not visible. (Right) The leading three PCA components of token embeddings for a 2-layer Transformer with $d=512$ trained for 200 epochs on $L=6$ data, with zeros included. The leading three PCA components explain $81.76\%$ of variance. The octahedron exhibits dihedral symmetry.
  • ...and 4 more figures
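
The Figure 5 caption quotes the share of variance explained by the leading three PCA components of the token embeddings. The sketch below shows one way such a number could be computed with NumPy, assuming the embeddings are available as an array of shape (number of tokens, $d$). The function names are hypothetical, and no values from the paper are reproduced.

```python
import numpy as np


def top_k_explained_variance(embeddings: np.ndarray, k: int = 3) -> float:
    """Fraction of total variance captured by the leading k principal components."""
    # PCA operates on mean-centered data.
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Squared singular values are proportional to the variance along each component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return float(variances[:k].sum() / variances.sum())


def project_top_k(embeddings: np.ndarray, k: int = 3) -> np.ndarray:
    """Project embeddings onto their leading k principal directions."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T
```

The projected coordinates from `project_top_k` are what a three-dimensional scatter plot of the embeddings would display.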