OmniLearned: A Foundation Model Framework for All Tasks Involving Jet Physics

Wahid Bhimji, Chris Harris, Vinicius Mikuni, Benjamin Nachman

TL;DR

This work addresses the need for a scalable foundation model in jet physics by introducing OmniLearned, an upgraded PET v2-based framework trained on over one billion jets and paired with a unified data-access software stack. The approach combines classification and generation objectives, via a multi-task loss and a diffusion/flow-matching generative head, to learn rich per-jet representations that transfer across tasks. Key contributions include architectural enhancements (local/global attention with physics-informed biases and multiple task heads), a massive, diverse pretraining dataset with broad labels, and demonstrated state-of-the-art performance on top-quark tagging, b-/c-tagging with ATLAS full simulation, and anomaly detection on CMS open data. The results indicate improved discovery potential across past, current, and future collider experiments. Larger OmniLearned models deliver the strongest gains, albeit at higher compute cost, and the framework offers broad applicability beyond jet physics.

Abstract

Foundation models use large datasets to build an effective representation of data that can be deployed on diverse downstream tasks. Previous research developed the OmniLearn foundation model for jet physics, using unique properties of particle physics, and showed that it could significantly advance discovery potential across collider experiments. This paper introduces a major upgrade, resulting in the OmniLearned framework. This framework has three new elements: (1) updates to the model architecture and training, (2) a training set of over one billion jets, and (3) well-documented software for accessing all datasets and models. We demonstrate OmniLearned with three representative tasks: top-quark jet tagging with the community Delphes-based benchmark dataset, b-tagging with ATLAS full simulation, and anomaly detection with CMS experimental data. In each case, OmniLearned is the state of the art, further expanding the discovery potential of past, current, and future collider experiments.

Paper Structure

This paper contains 13 sections, 8 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Neural network architecture used to train OmniLearned. The general model architecture (a) consists of the new PET v2 body (b), input specific embedding (c) and task-specific blocks (d). See the text for more details.
  • Figure 2: Background rejection efficiency for a fixed signal efficiency of $30\%$ in the community top tagging dataset.
  • Figure 3: Receiver Operating Characteristic for b- (left) and c-tagging (right) for different algorithms. The ratio plots show the background rejection improvement compared to GN2 for different background jet classes.
  • Figure 4: Anomaly detection results using the CMS Open Data. Different model sizes (rows) for models trained from scratch (left column) or fine-tuned with OmniLearned (right column) are shown. Different thresholds of the anomaly score, resulting in different data efficiencies, are shown together with the expected sensitivity.
  • Figure 5: Anomaly detection results using the CMS Open Data. The anomaly score is calculated directly from the classes used to pre-train OmniLearned using the ratio between the prediction for 3-prong decays to QCD. Results for the small (left) and medium (right) models are shown.
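
The two quantities named in the captions of Figures 2 and 5 can be illustrated with a minimal sketch. This is not the authors' code: the function names `background_rejection` and `ratio_anomaly_score`, the `eps` regularizer, and the convention that rejection is the inverse of the background efficiency are assumptions made here for illustration. The first function picks the score threshold that keeps a fixed fraction of signal jets (30% in Figure 2) and reports how strongly background is suppressed at that threshold. The second forms the ratio of a classifier's 3-prong prediction to its QCD prediction, as described for Figure 5.

```python
import numpy as np


def background_rejection(scores_sig, scores_bkg, signal_eff=0.30):
    """Inverse background efficiency at a fixed signal efficiency.

    Assumed convention (not confirmed by the source): rejection = 1 / eps_B,
    where eps_B is the fraction of background jets passing the threshold.
    """
    scores_sig = np.asarray(scores_sig, dtype=float)
    scores_bkg = np.asarray(scores_bkg, dtype=float)
    # Threshold above which a fraction `signal_eff` of signal jets remains.
    threshold = np.quantile(scores_sig, 1.0 - signal_eff)
    bkg_eff = np.mean(scores_bkg > threshold)
    # No background passing the cut means the rejection is unbounded.
    return np.inf if bkg_eff == 0 else 1.0 / bkg_eff


def ratio_anomaly_score(p_three_prong, p_qcd, eps=1e-12):
    """Ratio of the 3-prong class prediction to the QCD class prediction.

    `eps` is a hypothetical guard against division by zero.
    """
    p_three_prong = np.asarray(p_three_prong, dtype=float)
    p_qcd = np.asarray(p_qcd, dtype=float)
    return p_three_prong / (p_qcd + eps)


if __name__ == "__main__":
    # Toy inputs, not taken from the paper.
    sig = np.linspace(0.0, 1.0, 101)
    bkg = np.array([0.1, 0.2, 0.3, 0.8])
    print(background_rejection(sig, bkg))
    print(ratio_anomaly_score([0.5, 0.1], [0.25, 0.8]))
```

In practice the inputs would be per-jet classifier outputs on labelled test samples. The toy arrays above only exercise the arithmetic.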