Event Tokenization and Next-Token Prediction for Anomaly Detection at the Large Hadron Collider

Ambre Visive, Polina Moskvitina, Clara Nellist, Roberto Ruiz de Austri, Sascha Caron

TL;DR

The paper presents a novel use of encoder-based, LLM-like networks trained on background collider events to perform unsupervised anomaly detection via masked-token reconstruction. Collider events are tokenized into sequences and modeled by a two-layer Transformer; reconstruction scores then flag deviations from the learned background distribution, as demonstrated on a four-top-quark production benchmark. The method achieves a ROC-AUC of about 0.67, which is competitive with certain unsupervised approaches but does not surpass the best DDD-based methods. The approach offers a flexible, data-driven way to reveal subtle discrepancies in LHC data and could support model-independent searches for new physics without explicit signal modeling.

Abstract

We propose a novel use of Large Language Models (LLMs) as unsupervised anomaly detectors in particle physics. Using lightweight LLM-like networks with encoder-based architectures trained to reconstruct background events via masked-token prediction, our method identifies anomalies through deviations in reconstruction performance, without prior knowledge of signal characteristics. Applied to searches for simultaneous four-top-quark production, this token-based approach shows competitive performance against established unsupervised methods and effectively captures subtle discrepancies in collider data, suggesting a promising direction for model-independent searches for new physics.
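
As a rough illustration of the idea described above, the following Python sketch tokenizes an event and scores it by masked-token reconstruction. It is not the authors' implementation: the binning edges in `PT_EDGES`, the token layout, the `mask_id` convention, and the unigram stand-in for the Transformer (`make_unigram_predictor`) are all assumptions made for this example. The per-token loss is the sparse categorical cross-entropy, i.e. the negative log-probability assigned to the true token.

```python
import numpy as np

# Hypothetical transverse-momentum bin edges in GeV (assumed, not from the paper).
PT_EDGES = np.array([50.0, 100.0, 200.0])


def tokenize_event(objects, pt_edges=PT_EDGES):
    """Map a list of (object_type_id, pt) pairs to integer tokens.

    Each token encodes the object type and its pT bin in a single integer.
    """
    n_bins = len(pt_edges) + 1
    tokens = []
    for type_id, pt in objects:
        pt_bin = int(np.digitize(pt, pt_edges))  # value in 0..n_bins-1
        tokens.append(type_id * n_bins + pt_bin)
    return np.array(tokens, dtype=np.int64)


def make_unigram_predictor(background_events, vocab_size):
    """Stand-in for a trained encoder: token frequencies seen in background.

    A real model would condition on the unmasked tokens; this one ignores them.
    """
    counts = np.ones(vocab_size)  # Laplace smoothing avoids zero probabilities
    for ev in background_events:
        np.add.at(counts, ev, 1)
    probs = counts / counts.sum()
    return lambda masked_tokens, position: probs


def masked_token_score(tokens, predict_proba, mask_id):
    """Mask each position in turn and average the cross-entropy of the true token."""
    losses = []
    for i in range(len(tokens)):
        masked = tokens.copy()
        masked[i] = mask_id
        probs = predict_proba(masked, i)  # distribution over the vocabulary
        losses.append(-np.log(probs[tokens[i]] + 1e-12))
    return float(np.mean(losses))


# Usage: two object types x four pT bins gives a vocabulary of 8 tokens.
vocab_size, mask_id = 8, 8
typical = tokenize_event([(0, 30.0), (1, 150.0)])
predictor = make_unigram_predictor([typical] * 50, vocab_size)
unusual = np.array([3, 3])
score_typical = masked_token_score(typical, predictor, mask_id)
score_unusual = masked_token_score(unusual, predictor, mask_id)
```

Events built from tokens that are rare in the background sample receive a larger score, which is the quantity an anomaly search would cut on.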

Paper Structure

This paper contains 9 sections, 1 equation, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Illustrative distribution of the aggregated reconstruction scores in a perfectly trained model, evaluated with sparse categorical cross-entropy, for background (blue) and signal (cyan) events. The red dashed line indicates the optimal threshold used to separate the two classes.
  • Figure 2: Results from the LLM-like method and comparisons to alternative methods.
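
The separation sketched in Figure 1 and the ROC-AUC quoted in the TL;DR can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, and "optimal threshold" is taken here to mean the cut maximizing the true-positive rate minus the false-positive rate (Youden's J), which the source does not specify.

```python
import numpy as np


def roc_auc(bkg_scores, sig_scores):
    """Probability that a random signal event scores above a random background
    event, with ties counted as one half (rank-based definition of ROC-AUC)."""
    bkg = np.asarray(bkg_scores, dtype=float)
    sig = np.asarray(sig_scores, dtype=float)
    greater = (sig[:, None] > bkg[None, :]).sum()
    ties = (sig[:, None] == bkg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(sig) * len(bkg)))


def best_threshold(bkg_scores, sig_scores):
    """Score cut maximizing TPR - FPR; events with score >= cut are flagged."""
    bkg = np.asarray(bkg_scores, dtype=float)
    sig = np.asarray(sig_scores, dtype=float)
    candidates = np.unique(np.concatenate([bkg, sig]))
    j = [(sig >= t).mean() - (bkg >= t).mean() for t in candidates]
    return float(candidates[int(np.argmax(j))])


# Usage with toy reconstruction scores (made-up numbers, not paper results).
bkg = [0.1, 0.2]
sig = [0.8, 0.9]
auc = roc_auc(bkg, sig)
cut = best_threshold(bkg, sig)
```

A ROC-AUC of 0.5 corresponds to no separation and 1.0 to perfect separation, so the reported value of about 0.67 indicates partial overlap between the two score distributions.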