Pretrained Event Classification Model for High Energy Physics Analysis
Joshua Ho, Benjamin Ryan Roberts, Shuo Han, Haichen Wang
TL;DR
This work develops a collider event foundation model built on a Graph Neural Network and pretrained on 120 million simulated proton-proton collision events across 12 physics processes. Pretraining with multiclass or multilabel classification objectives yields transferable event representations that can be fine-tuned for downstream classification tasks. Fine-tuning gives notable accuracy gains over training from scratch when training data are limited, and comparable performance when data are abundant. The study also analyzes latent representations using Centered Kernel Alignment and finds that fine-tuned pretrained models learn representations notably different from those of models trained from scratch. Finally, fine-tuning on multiple tasks substantially reduces total training time compared to training each task from scratch, suggesting scalable benefits for large-scale HEP analyses.
Abstract
We introduce a foundation model for event classification in high-energy physics, built on a Graph Neural Network architecture and trained on 120 million simulated proton-proton collision events spanning 12 distinct physics processes. The model is pretrained to learn a general and robust representation of collision data using challenging multiclass and multilabel classification tasks. Its performance is evaluated across five event classification tasks, which include both physics processes used during pretraining and new processes not encountered during pretraining. Fine-tuning the pretrained model significantly improves classification performance, particularly in scenarios with limited training data, demonstrating gains in both accuracy and computational efficiency. To investigate the underlying mechanisms behind these performance improvements, we employ a representational similarity evaluation framework based on Centered Kernel Alignment. This analysis reveals notable differences in the learned representations of fine-tuned pretrained models compared to baseline models trained from scratch.
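To make the representational similarity measure concrete, the following is a minimal sketch of linear Centered Kernel Alignment between two sets of layer activations. It assumes the linear-kernel variant of CKA computed on activations of shape (events, features). The abstract does not state which kernel or estimator the authors use, so the function name `linear_cka` and the random stand-in activations are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def linear_cka(X, Y):
    """Linear CKA between two activation matrices.

    X: array of shape (n_events, d1), Y: array of shape (n_events, d2).
    Rows must correspond to the same events in both matrices.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must have the same number of rows (events)")

    # Center every feature column so the Gram matrices are centered.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    # CKA = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)


# Hypothetical stand-ins for latent activations of two models on the same events.
rng = np.random.default_rng(seed=0)
acts_finetuned = rng.normal(size=(200, 16))
acts_scratch = rng.normal(size=(200, 32))

similarity = linear_cka(acts_finetuned, acts_scratch)
print(f"linear CKA: {similarity:.3f}")
```

The score lies between 0 and 1 and is unchanged by orthogonal transformations and isotropic rescaling of either representation, which is why it can compare layers of different widths or models trained from different initializations.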
