Neural Network Quantum Field Theory from Transformer Architectures

Dmitry S. Ageev; Yulia A. Ageeva

TL;DR

The paper proposes a neural-network quantum-field-theory (NN-QFT) construction of Euclidean scalar fields from transformer attention heads, with $n$-point correlators defined as averages over random parameter ensembles. A single head with shared softmax weights yields non-Gaussian statistics that persist in the infinite-width limit $d_k\to\infty$: the connected four-point function contains a finite independence-breaking contribution, expressible as a covariance over the query-key weights. Euclidean-invariant kernels can be engineered via random-feature token embeddings. When $N_h$ independent heads are aggregated with the standard $1/N_h$ variance normalization, connected non-Gaussian correlators are suppressed as $\mathcal{O}(1/N_h)$ and vanish as $N_h\to\infty$, yielding a Gaussian NN-QFT in the large-head limit. The work thus links transformer architectures to QFT-like behavior: non-Gaussianity appears at the single-head level and is washed out by multi-head averaging, with potential for constructing interacting actions within the NN-QFT framework.
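The $1/N_h$ suppression can be seen from a short cumulant-counting argument. The sketch below is not taken from the paper: it assumes the multi-head field is the sum of $N_h$ independent, identically distributed single-head fields $\phi_h$ scaled by $1/\sqrt{N_h}$ (one reading of "$1/N_h$ variance normalization"), and the symbols $G^{(n)}_{c}$ and $G^{(n)}_{c,\,1}$ for the connected $n$-point functions of the multi-head and single-head fields are illustrative notation.

$$
\phi(x) = \frac{1}{\sqrt{N_h}} \sum_{h=1}^{N_h} \phi_h(x)
\quad\Longrightarrow\quad
G^{(n)}_{c}(x_1,\dots,x_n) = N_h \left(\frac{1}{\sqrt{N_h}}\right)^{n} G^{(n)}_{c,\,1}(x_1,\dots,x_n) = N_h^{\,1-n/2}\, G^{(n)}_{c,\,1}(x_1,\dots,x_n),
$$

$$
G^{(2)}_{c} = G^{(2)}_{c,\,1}, \qquad G^{(4)}_{c} = \frac{1}{N_h}\, G^{(4)}_{c,\,1}.
$$

The first step uses two standard facts: cumulants of independent variables add, and the $n$-th cumulant is homogeneous of degree $n$ under rescaling. Under these assumptions the two-point function is independent of $N_h$, while the connected four-point function falls off as $1/N_h$.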

Abstract

We propose a neural-network construction of Euclidean scalar quantum field theories from transformer attention heads, defining $n$-point correlators by averaging over random network parameters in the NN-QFT framework. For a single attention head, shared random softmax weights couple different width coordinates and induce non-Gaussian field statistics that persist in the infinite-width limit $d_k\to\infty$. We compute the two-point function in an attention-weight representation and show how Euclidean-invariant kernels can be engineered via random-feature token embeddings. We then analyze the connected four-point function and identify an "independence-breaking" contribution, expressible as a covariance over query-key weights, which remains finite at infinite width. Finally, we show that summing many independent heads with standard $1/N_h$ normalization suppresses connected non-Gaussian correlators as $1/N_h$, yielding a Gaussian NN-QFT in the large-head limit.
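The mechanism described above can be illustrated numerically with a deliberately small toy model. This is a sketch under stated assumptions, not the paper's construction: the query and keys are scalars, the logits are `beta * q * k_j`, the values are independent unit Gaussians, and the names `head_output`, `field_samples`, `excess_kurtosis`, `beta` and `n_tokens` are all hypothetical. Conditioned on the attention weights $a_j$, a head's output is Gaussian with variance $s=\sum_j a_j^2$; averaging over the random weights turns it into a scale mixture of Gaussians, whose excess kurtosis $3\,\mathrm{Var}(s)/\mathbb{E}[s]^2$ is positive. Summing independent heads with a $1/\sqrt{N_h}$ prefactor should then shrink that excess kurtosis roughly as $1/N_h$.

```python
import math
import random


def head_output(rng, n_tokens=8, beta=3.0):
    """One toy attention head (hypothetical model, not the paper's).

    A scalar query q is shared by all tokens, so the softmax weights
    are random and common to the whole weighted sum of values.
    """
    q = rng.gauss(0.0, 1.0)
    logits = [beta * q * rng.gauss(0.0, 1.0) for _ in range(n_tokens)]
    top = max(logits)  # subtract the max for numerical stability
    unnorm = [math.exp(l - top) for l in logits]
    total = sum(unnorm)
    # Values are independent unit Gaussians, weighted by the softmax.
    return sum((u / total) * rng.gauss(0.0, 1.0) for u in unnorm)


def field_samples(n_heads, n_samples, seed=0):
    """Draw samples of the normalized sum of n_heads independent heads."""
    rng = random.Random(seed)
    norm = 1.0 / math.sqrt(n_heads)
    return [
        norm * sum(head_output(rng) for _ in range(n_heads))
        for _ in range(n_samples)
    ]


def excess_kurtosis(xs):
    """Sample excess kurtosis: m4 / m2^2 - 3 (zero for a Gaussian)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


# Compare a single head with an 8-head sum at fixed sample size.
k1 = excess_kurtosis(field_samples(n_heads=1, n_samples=10000))
k8 = excess_kurtosis(field_samples(n_heads=8, n_samples=10000))
print("excess kurtosis, 1 head :", k1)
print("excess kurtosis, 8 heads:", k8)
```

The excess kurtosis at a single point stands in for the connected four-point function at coincident points. The estimates are Monte Carlo averages, so the 8-head value fluctuates around a small number rather than matching $k_1/8$ exactly; increasing `n_samples` tightens the comparison.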

Paper Structure

This paper contains 5 sections and 48 equations:

Introduction
Scalar field from single-head transformer
Four-point function and non-Gaussianity from shared attention weights within single-head transformer
Many heads and Gaussian NN-QFT
Conclusion
