Latent Representation Learning in Heavy-Ion Collisions with MaskPoint Transformer

Jing-Zong Zhang, Shuang Guo, Li-Lin Zhu, Lingxiao Wang, Guo-Liang Ma

TL;DR

The paper addresses the challenge of extracting informative features from the high-dimensional final-state data produced in heavy-ion collisions. It introduces MaskPoint, a Transformer-based autoencoder trained in a two-stage regime: self-supervised pre-training on unlabeled AMPT events to learn latent representations, followed by supervised fine-tuning with a lightweight classifier for collision-system identification. The method achieves significantly higher accuracy than a PointNet baseline and, through PCA and SHAP analyses, reveals that the learned features encode nonlinear relationships with physical observables beyond what is captured by individual variables. This approach provides a general, robust foundation for AI-driven discovery in quark–gluon plasma studies and related emergent phenomena, enabling more powerful and interpretable analyses of high-energy nuclear collision data.

Abstract

A central challenge in high-energy nuclear physics is to extract informative features from the high-dimensional final-state data of heavy-ion collisions (HIC) in order to enable reliable downstream analyses. Traditional approaches often rely on selected observables, which may miss subtle but physically relevant structures in the data. To address this, we introduce a Transformer-based autoencoder trained with a two-stage paradigm: self-supervised pre-training followed by supervised fine-tuning. The pretrained encoder learns latent representations directly from unlabeled HIC data, providing a compact and information-rich feature space that can be adapted to diverse physics tasks. As a case study, we apply the method to distinguish between large and small collision systems, where it achieves significantly higher classification accuracy than PointNet. Principal component analysis and SHAP interpretation further demonstrate that the autoencoder captures complex nonlinear correlations beyond individual observables, yielding features with strong discriminative and explanatory power. These results establish our two-stage framework as a general and robust foundation for feature learning in HIC, opening the door to more powerful analyses of quark–gluon plasma properties and other emergent phenomena. The implementation is publicly available at https://github.com/Giovanni-Sforza/MaskPoint-AMPT.

Paper Structure

This paper contains 10 sections, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Architecture of the masked autoencoder. The input point cloud is first partitioned into patches by PointNet and then encoded by a Transformer. During pre-training, the encoder output is passed to a Transformer decoder for a discrimination task between real and fake particles. For fine-tuning and analysis, only the pretrained encoder is retained as the feature extractor.
  • Figure 2: PCA projection of the latent features learned by the autoencoder during pre-training with 3D momentum inputs, shown in the PC1–PC2 plane. The two colors denote two different systems. The clear clustering indicates that the model, even without labels, captures and distinguishes intrinsic physical differences between the two systems.
  • Figure 3: Classification accuracy between large and small systems across $N_{ch}$ bins for the fine-tuned autoencoder (black) and PointNet (red). The autoencoder consistently outperforms PointNet, validating the effectiveness of the “self-supervised pre-training + supervised fine-tuning” strategy.
  • Figure 4: Distributions of PC1 from the autoencoder (top) and PointNet (middle), compared with $\sigma_{\eta}$ (bottom). The overlap between Pb+Pb and p+Pb is 0.27% for the autoencoder, lower than PointNet (2.42%) and $\sigma_{\eta}$ (2.71%). While PointNet approaches the theoretical limit of $\sigma_{\eta}$, the autoencoder surpasses it, demonstrating stronger discriminative power from self-supervised pre-training.
  • Figure 5: Correlation coefficients between PCs and observables, with SHAP contributions to PC1. Autoencoder PC1 shows near-zero linear correlations but a high SHAP weight from $\sigma_{\eta}$, indicating that it encodes key information in a non-linear manner, which explains its superior performance.
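
The PCA projection and the PC1 overlap figures quoted in the captions above can be illustrated with a minimal sketch. It makes several assumptions: the latent features are synthetic Gaussian stand-ins rather than real encoder outputs, the helper names `pca_project` and `histogram_overlap` are hypothetical, and the overlap is taken as the shared area of two normalized histograms, which may differ from the definition used in the paper.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project the rows of `features` onto their leading principal components."""
    centered = features - features.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def histogram_overlap(a, b, bins=100):
    """Shared area of two normalized histograms: 0 (disjoint) to 1 (identical)."""
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    edges = np.linspace(lo, hi, bins + 1)
    pa, _ = np.histogram(a, bins=edges)
    pb, _ = np.histogram(b, bins=edges)
    pa = pa / pa.sum()
    pb = pb / pb.sum()
    return float(np.minimum(pa, pb).sum())


# Synthetic stand-in for encoder outputs (assumption: NOT real AMPT features).
rng = np.random.default_rng(0)
latent_dim = 16
n_events = 2000
shift = np.zeros(latent_dim)
shift[0] = 6.0  # the two toy "systems" differ along one latent direction

system_a = rng.normal(size=(n_events, latent_dim))
system_b = rng.normal(size=(n_events, latent_dim)) + shift
features = np.vstack([system_a, system_b])
labels = np.repeat([0, 1], n_events)

# Project to the PC1-PC2 plane, then compare the PC1 distributions per system.
pcs = pca_project(features, n_components=2)
pc1 = pcs[:, 0]
overlap = histogram_overlap(pc1[labels == 0], pc1[labels == 1])
print(f"PC1 overlap between the two toy systems: {overlap:.2%}")
```

In an actual analysis, `features` would be replaced by the pretrained encoder's output for each event and `labels` by the collision system. A smaller overlap then indicates that PC1 separates the two systems more cleanly.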