Learning the Simplicity of Scattering Amplitudes

Clifford Cheung; Aurélien Dersy; Matthew D. Schwartz

TL;DR

This work shows that encoder-decoder transformers can substantially simplify spinor-helicity scattering amplitudes by translating long, highly redundant expressions into compact, physically meaningful forms. The approach has two stages: one-shot simplification for short expressions, and a sequential pipeline guided by contrastive embeddings for lengthy amplitudes, which reduces expressions with hundreds of terms to Parke-Taylor-like monomials or other concise formulas. Training pairs are produced by backward generation, in which spinor identities are used to create diverse complicated inputs from simple targets. At inference, beam search and nucleus sampling, together with the contrastive-embedding stage, yield high accuracy on moderate-length inputs and dramatic reductions for long expressions. The results include exact Parke-Taylor reductions for four- and five-point amplitudes and novel compact expressions involving scalars and gravitons. They illustrate ML-based symbolic manipulation as a practical tool in high-energy theory, with potential applicability to momentum twistors and beyond.
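As a rough illustration of the nucleus (top-p) sampling mentioned above, the sketch below draws one token from the smallest set of candidates whose cumulative probability reaches a threshold. It is a generic, standard-library version of the technique; the function names, the toy distribution, and the value of `top_p` are assumptions made for illustration and are not taken from the paper.

```python
import random


def nucleus_indices(probs, top_p):
    """Return indices of the smallest high-probability set with mass >= top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, mass = [], 0.0
    for i in ranked:
        nucleus.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return nucleus


def nucleus_sample(probs, top_p, rng):
    """Sample one index from the nucleus, weighted by the original probabilities."""
    nucleus = nucleus_indices(probs, top_p)
    weights = [probs[i] for i in nucleus]  # rng.choices renormalises the weights
    return rng.choices(nucleus, weights=weights, k=1)[0]


# Toy next-token distribution (hypothetical values, for illustration only).
toy_probs = [0.5, 0.3, 0.15, 0.05]
rng = random.Random(0)
samples = [nucleus_sample(toy_probs, top_p=0.75, rng=rng) for _ in range(200)]
```

With `top_p=0.75` only the two most likely tokens are ever sampled, so the low-probability tail is cut off while some diversity among candidate outputs is kept.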

Abstract

The simplification and reorganization of complex expressions lie at the core of scientific progress, particularly in theoretical high-energy physics. This work explores the application of machine learning to a particular facet of this challenge: the task of simplifying scattering amplitudes expressed in terms of spinor-helicity variables. We demonstrate that an encoder-decoder transformer architecture achieves impressive simplification capabilities for expressions composed of handfuls of terms. Lengthier expressions are handled with an additional embedding network, trained using contrastive learning, which isolates subexpressions that are more likely to simplify. The resulting framework is capable of reducing expressions with hundreds of terms, a regular occurrence in quantum field theory calculations, to vastly simpler equivalent expressions. Starting from lengthy input expressions, our networks can generate the Parke-Taylor formula for five-point gluon scattering, as well as new compact expressions for five-point amplitudes involving scalars and gravitons. An interactive demonstration can be found at https://spinorhelicity.streamlit.app.
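To make concrete why spinor-helicity expressions admit many equivalent forms, the following minimal sketch checks the Schouten identity, $\langle 12\rangle\langle 34\rangle + \langle 13\rangle\langle 42\rangle + \langle 14\rangle\langle 23\rangle = 0$, numerically on random complex spinors. The helper names and the random sampling scheme are assumptions for illustration; the same kind of evaluation at random kinematic points is one way to test whether two expressions are numerically equivalent.

```python
import random


def random_spinor(rng):
    """Return a random two-component complex spinor as a tuple."""
    return (
        complex(rng.uniform(-1, 1), rng.uniform(-1, 1)),
        complex(rng.uniform(-1, 1), rng.uniform(-1, 1)),
    )


def angle(a, b):
    """Antisymmetric angle bracket <ab> = a_1 b_2 - a_2 b_1."""
    return a[0] * b[1] - a[1] * b[0]


def schouten_residual(l1, l2, l3, l4):
    """Left-hand side of the Schouten identity; vanishes up to rounding error."""
    return (
        angle(l1, l2) * angle(l3, l4)
        + angle(l1, l3) * angle(l4, l2)
        + angle(l1, l4) * angle(l2, l3)
    )


rng = random.Random(0)
spinors = [random_spinor(rng) for _ in range(4)]
residual = schouten_residual(*spinors)
```

Because the three-term combination vanishes identically, any product of two brackets can be traded for a sum of two others, which is one source of the redundancy that the simplification task has to undo.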

Paper Structure (23 sections, 43 equations, 15 figures, 4 tables)

Introduction
Notation and training data
Spinor-helicity formalism
Target data set
Input data set
Analytic simplification
One-shot learning
Network architecture
Results
Embedding analysis
Sequential simplification
Contrastive learning
Grouping terms
Simplifying long expressions
Physical amplitudes
...and 8 more sections

Figures (15)

Figure 1: Spinor-helicity expressions are simplified in several steps. To start, individual terms are projected into an embedding space (grey sphere). Using contrastive learning, we train a "projection" transformer encoder to learn a mapping that groups similar terms close to one another in the embedding space. After identifying similar terms we use a "simplify" transformer encoder-decoder to predict the corresponding simple form. After simplifying all distinct groups, this procedure is repeated with the resulting expression, iterating until no further simplification is possible.
Figure 2: Performance on the held-out test set for the one-shot simplification of complicated $n$-point spinor-helicity amplitudes of up to 1k tokens. We compare the length of the model prediction against the length of the target. Green shows when the network reduces an expression to the correct target length while blue shows when the network simplifies beyond the target, which is possible for four-point amplitudes since they are highly redundant. The results are reported for different beam sizes used at inference, where only the shortest hypothesis is retained.
Figure 3: Accuracy on the held-out test set for the one-shot simplification of complicated $n$-point spinor-helicity amplitudes. A model-generated amplitude is deemed accurate if it is both numerically equivalent to the input amplitude and simpler than it. We compare the accuracy based on the number of distinct terms in the numerator of the target amplitude (top row) and based on the number of identities used to scramble it (bottom row).
Figure 4: Accuracy on the held-out test set for the one-shot simplification of complicated five-point spinor-helicity amplitudes. We compare models that have seen amplitudes scrambled up to three times at most during training (blue) to models trained on up to five scrambles (orange).
Figure 5: t-SNE visualization of 5k input amplitude embeddings. Each amplitude embedding is obtained using the transformer encoder, averaging over all constituent word embeddings. The points are color-coded according to the number of distinct terms in the numerator of the corresponding input amplitude.
...and 10 more figures
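
The captions of Figures 1 and 5 mention averaging token embeddings and grouping similar terms in the embedding space. A toy sketch of those two steps is given below, assuming mean pooling followed by greedy grouping on cosine similarity. The vectors, the threshold, and the greedy grouping rule are all hypothetical stand-ins chosen for illustration; the captions do not specify the grouping procedure used in the paper.

```python
import math


def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one term embedding."""
    count = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / count for d in range(dim)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def group_terms(embeddings, threshold):
    """Greedy grouping: a term joins the first group whose seed it resembles."""
    groups = []
    for idx, emb in enumerate(embeddings):
        for group in groups:
            if cosine(embeddings[group[0]], emb) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])  # no similar seed found, start a new group
    return groups


# Made-up token vectors for three terms (not produced by any trained model).
toy_terms = [
    [[1.0, 0.0], [1.0, 0.2]],
    [[0.9, 0.1], [1.1, 0.1]],
    [[0.0, 1.0], [0.2, 1.0]],
]
term_embeddings = [mean_pool(tokens) for tokens in toy_terms]
groups = group_terms(term_embeddings, threshold=0.9)
```

In this toy input the first two terms point in nearly the same direction and end up in one group, while the third forms its own group; in the pipeline described in Figure 1, each such group would then be passed to the simplification model.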
