Learning Symmetry-Independent Jet Representations via Jet-Based Joint Embedding Predictive Architecture
Subash Katel, Haoyang Li, Zihan Zhao, Raghav Kansal, Farouk Mokhtar, Javier Duarte
TL;DR
The paper addresses the challenge of training jet-related models when labeled data is scarce or mismatched by introducing J-JEPA, a self-supervised, augmentation-free pretraining framework. J-JEPA predicts target-subjet representations from context-subjet representations, using the target positions as hints; the target encoder is stabilized via an exponential moving average (EMA), and predictions are made in representation space with an $L_2$ loss. Because it removes the need for hand-crafted augmentations tailored to each downstream task, the approach is applicable across tasks, and the paper demonstrates that pretrained representations outperform randomly initialized baselines for jet tagging, especially when labeled data is limited. Key contributions include a physical positional encoding, two embedding strategies for subjets, and a masking scheme inspired by I-JEPA, all validated by pretraining on JetClass and finetuning on Top Tagging. The findings suggest that J-JEPA is a scalable path toward large-scale, cross-task foundation models in jet physics, potentially reducing reliance on labeled simulations and enabling robust transfer to real data.
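The pretraining objective summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the linear encoders, the mean pooling of the context, the two-component positions, and every name and shape below are assumptions chosen for brevity. It shows only the forward loss and the EMA update, not the gradient step on the context encoder and predictor.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_encoder(params, subjets):
    """Toy stand-in encoder (assumption): one linear map per subjet."""
    return subjets @ params

def ema_update(target_params, context_params, momentum=0.99):
    """Move the target weights toward the context weights (EMA)."""
    return momentum * target_params + (1.0 - momentum) * context_params

def jepa_loss(context_params, target_params, predictor_params,
              subjets, positions, context_idx, target_idx):
    # Encode the visible (context) subjets with the trainable encoder.
    ctx_repr = linear_encoder(context_params, subjets[context_idx])
    # Encode the masked (target) subjets with the EMA encoder;
    # in a real training loop no gradient flows through this branch.
    tgt_repr = linear_encoder(target_params, subjets[target_idx])
    # Pool the context, then condition on each target's position as a hint.
    pooled = ctx_repr.mean(axis=0)
    pred_in = np.concatenate(
        [np.tile(pooled, (len(target_idx), 1)), positions[target_idx]],
        axis=1,
    )
    pred = pred_in @ predictor_params
    # Squared L2 distance in representation space, averaged over targets.
    return float(np.mean(np.sum((pred - tgt_repr) ** 2, axis=1)))

# Hypothetical sizes: 8 subjets, 4 input features, 16-dim embeddings.
n_subjets, feat_dim, emb_dim = 8, 4, 16
subjets = rng.normal(size=(n_subjets, feat_dim))
positions = rng.normal(size=(n_subjets, 2))  # assumed 2D subjet coordinates
context_params = rng.normal(size=(feat_dim, emb_dim))
target_params = context_params.copy()
predictor_params = rng.normal(size=(emb_dim + 2, emb_dim))

# Randomly mask 3 subjets as targets; the rest form the context.
perm = rng.permutation(n_subjets)
target_idx, context_idx = perm[:3], perm[3:]

loss = jepa_loss(context_params, target_params, predictor_params,
                 subjets, positions, context_idx, target_idx)
target_params = ema_update(target_params, context_params)
```

In an actual training loop, `loss` would be minimized with respect to the context encoder and predictor only, with `ema_update` applied to the target encoder after each optimizer step.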
Abstract
In high-energy physics, self-supervised learning (SSL) methods have the potential to aid in the creation of machine learning models without the need for labeled datasets for a variety of tasks, including those related to jets, the narrow sprays of particles produced by quarks and gluons in high-energy particle collisions. This study introduces an approach to learning jet representations without hand-crafted augmentations using a jet-based joint embedding predictive architecture (J-JEPA), which aims to predict various physical targets from an informative context. Unlike other common SSL techniques, our method does not require hand-crafted augmentations, so J-JEPA avoids introducing biases that could harm downstream tasks. Since different tasks generally require invariance under different augmentations, training without hand-crafted augmentations enables versatile applications, offering a pathway toward a cross-task foundation model. We finetune the representations learned by J-JEPA for jet tagging and benchmark them against task-specific representations.
