GAN-based data augmentation for rare and exotic hadron searches in Pb–Pb collisions in ALICE

Anisa Khatun

TL;DR

This work tackles the limited statistics and high computational cost of searching for rare heavy-flavour and exotic hadrons in Pb–Pb collisions in ALICE. It demonstrates a GAN-based data augmentation approach, trained on reconstructed observables from augmented Monte Carlo (MC), that generates synthetic signal samples for the rare decay $\Xi_{c}^{+} \rightarrow \Xi^{-} + \pi^{+} + \pi^{+}$ without additional full detector simulations. The GANs are shown to reproduce marginal distributions and multidimensional correlations, with Kolmogorov–Smirnov (KS) tests indicating statistical compatibility and with stable training behaviour. Taken together, the method offers a scalable pathway to enhance rare-signal sensitivity in the ALICE heavy-flavour programme and can be extended to other exotic states and decay topologies.

Abstract

This work presents a feasibility study aimed at enhancing the reconstruction sensitivity for rare heavy-flavour hadrons in Pb–Pb collisions in the ALICE experiment, using the $\Xi_{c}^{+}$ baryon as a benchmark. The $\Xi_{c}^{+}$ baryon has a low production rate and complex decay topologies, such as the decay $\Xi_{c}^{+} \rightarrow \Xi^{-} + \pi^{+} + \pi^{+}$ considered in this work. Traditional simulation workflows involving event embedding and full detector response are computationally expensive and statistically limited, especially for rare signals. This study represents the first exploration of generative models within the heavy-flavour programme of ALICE. Its input features are reconstructed physics quantities, such as momenta, positions, and decay vertex coordinates of the $\Xi_{c}^{+}$ decay products in Pb–Pb collisions, derived from augmented ALICE Monte Carlo simulations. These features serve as the training set for Generative Adversarial Networks (GANs) designed to generate statistically significant synthetic signal samples without the need for additional full simulations. While the $\Xi_{c}^{+}$ serves as a benchmark, the broader objective is to enable searches for exotic heavy-flavour hadrons or other exotic states with complex decay patterns. By leveraging GAN-based augmentation, this approach supports rare-signal extraction in computationally demanding analyses and opens the way to broader applications of generative models in the ALICE heavy-flavour programme.
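
For orientation, the adversarial training described in the abstract can be written as a minimax objective. The abstract does not state which loss the study uses, so the expression below is only the standard (original) GAN formulation, given as an assumption. Here $x$ stands for a vector of reconstructed features drawn from the ALICE MC sample, $z$ for the random noise fed to the generator $G$, and $D$ for the discriminator; the symbols $p_{\mathrm{MC}}$ and $p_{z}$ are notation introduced here, not taken from the paper.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{MC}}}\left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}}\left[ \log\left( 1 - D(G(z)) \right) \right]
```

In this formulation the discriminator is rewarded for assigning high $D(x)$ to MC samples and low $D(G(z))$ to generated ones, while the generator is rewarded for making $D(G(z))$ large.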

Paper Structure (6 sections, 6 figures)

Figures (6)

  • Figure 1: $\Xi_{\mathrm{c}}^{+}$ decay chain.
  • Figure 2: Schematic representation of the Generative Adversarial Network (GAN) architecture used in this study. The generator produces synthetic reconstructed features starting from random noise, while the discriminator attempts to distinguish generated samples from real ALICE Monte Carlo data.
  • Figure 3: Comparison of reconstructed feature distributions between GAN-generated samples and ALICE Monte Carlo at the beginning of the training.
  • Figure 4: Comparison of one-dimensional reconstructed feature distributions between GAN-generated samples and ALICE Monte Carlo after training. The corresponding Kolmogorov–Smirnov p-values quantify the statistical compatibility between the two samples for each observable.
  • Figure 5: Two-dimensional scatter plots illustrating correlations between selected reconstructed observables for GAN-generated samples and ALICE MC.
  • ...and 1 more figure
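
The caption of Figure 4 refers to a Kolmogorov–Smirnov p-value computed per observable between the GAN-generated and MC samples. The following is a minimal sketch of such a two-sample comparison for a single feature, not the code used in the study. The function name `ks_two_sample`, the toy Gaussian inputs, and the use of the asymptotic Kolmogorov series for the p-value are all assumptions made for illustration; in practice a library routine such as `scipy.stats.ks_2samp` would be the usual choice.

```python
import math
import random


def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov test for one observable.

    Returns (D, p): D is the largest distance between the two empirical
    CDFs, p is an asymptotic approximation of the p-value.
    """
    xs, ys = sorted(a), sorted(b)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    # Walk both sorted samples together, tracking the ECDF gap.
    while i < n and j < m:
        v = min(xs[i], ys[j])
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Asymptotic p-value with a small-sample correction (assumption:
    # this approximation is adequate for illustration purposes).
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.1:
        # The alternating series converges poorly here; p is ~1.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-ins for one reconstructed feature (hypothetical data).
    mc = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    gan_good = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    gan_bad = [rng.gauss(1.0, 1.0) for _ in range(2000)]
    print("compatible sample:", ks_two_sample(mc, gan_good))
    print("shifted sample:   ", ks_two_sample(mc, gan_bad))
```

A large p-value means the test finds no evidence that the two one-dimensional distributions differ. It says nothing about correlations between observables, which is why the figure list also includes two-dimensional comparisons (Figure 5).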