An Evaluation of Representation Learning Methods in Particle Physics Foundation Models
Michael Chen, Raghav Kansal, Abhijith Gandrakota, Zichun Hao, Jennifer Ngadiuba, Maria Spiropulu
TL;DR
This paper addresses the fragmented evaluation of representation-learning objectives in jet classification by establishing a fixed, unified framework around a ParT-style particle-cloud encoder with standardized preprocessing and sampling. It systematically compares four objective families: self-supervised contrastive learning (JetCLR), masked particle modeling (MPM), generative reconstruction (CLIP-VAE), and supervised contrastive learning (SupCon). Each is measured against a strong fully supervised baseline and targeted architectural upgrades. Fully supervised training with the ParT backbone achieves the best overall accuracy, while SupCon provides the strongest representation-learning signal, closely matching the supervised macro ROC-AUC and producing meaningful embeddings. The self-supervised (SSL) methods lag in several classes, particularly $qq$ and QCD. The results establish reproducible baselines and a reference framework for transparent, robust progress in particle-physics foundation models.
Abstract
We present a systematic evaluation of representation learning objectives for particle physics within a unified framework. Our study employs a shared transformer-based particle-cloud encoder with standardized preprocessing, matched sampling, and a consistent evaluation protocol on a jet classification dataset. We compare contrastive (supervised and self-supervised), masked particle modeling, and generative reconstruction objectives under a common training regimen. In addition, we introduce targeted supervised architectural modifications that achieve state-of-the-art performance on benchmark evaluations. This controlled comparison isolates the contributions of the learning objective, highlights their respective strengths and limitations, and provides reproducible baselines. We position this work as a reference point for the future development of foundation models in particle physics, enabling more transparent and robust progress across the community.
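To make the supervised contrastive objective named above concrete, the following is a minimal NumPy sketch of the standard SupCon loss in its usual textbook form. It is an illustration under stated assumptions, not the implementation used in this study: the function name `supcon_loss`, the default `temperature=0.1`, and the toy embeddings are all hypothetical, and the encoder that would produce the embeddings is omitted.

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over one batch (illustrative sketch).

    embeddings: (n, d) array of jet embeddings, n >= 2 (hypothetical input).
    labels:     (n,) array of integer class labels.
    temperature is an assumed default, not a value taken from the paper.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    y = np.asarray(labels)
    # L2-normalise so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    n = z.shape[0]

    logits = z @ z.T / temperature
    self_mask = np.eye(n, dtype=bool)
    # Exclude each anchor from its own denominator.
    logits = np.where(self_mask, -np.inf, logits)

    # Numerically stable log-softmax over all other samples in the batch.
    row_max = logits.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(logits - row_max).sum(axis=1, keepdims=True))
    log_prob = logits - log_denom

    # Positives: other samples that share the anchor's label.
    pos_mask = (y[:, None] == y[None, :]) & ~self_mask
    n_pos = pos_mask.sum(axis=1)
    valid = n_pos > 0  # anchors with no positive in the batch are skipped
    if not valid.any():
        return 0.0

    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[valid] / n_pos[valid]
    return float(per_anchor.mean())

# Toy check: two tight clusters in 2D.
toy = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
aligned = supcon_loss(toy, np.array([0, 0, 1, 1]))     # labels match clusters
misaligned = supcon_loss(toy, np.array([0, 1, 0, 1]))  # labels cut across clusters
print(aligned < misaligned)
```

The loss is small when samples sharing a label sit close together in embedding space and large when the labels cut across the clusters, which is the property that lets a supervised contrastive objective shape class-separated embeddings.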
