An interpretable unsupervised representation learning for high precision measurement in particle physics

Xing-Jian Lv, De-Xing Miao, Zi-Jun Xu, Jian-Chun Wang

TL;DR

This work tackles the interpretability gap in unsupervised representations for precision measurements in particle physics. It introduces the Histogram AutoEncoder (HistoAE), which uses a histogram-based loss (HistoLoss) to enforce a 2D latent space with clear physical meaning: charge and impact position. Applied to silicon microstrip detector beam-test data, HistoAE achieves charge resolution around 0.25 e and position resolution near 3 μm, matching conventional methods while operating in an entirely unsupervised framework. Additionally, the decoder serves as a fast generator for detector responses, enabling rapid simulations and future extensions to more complex latent spaces.

Abstract

Unsupervised learning has been widely applied to various tasks in particle physics. However, existing models lack precise control over their learned representations, limiting physical interpretability and hindering their use for accurate measurements. We propose the Histogram AutoEncoder (HistoAE), an unsupervised representation learning network featuring a custom histogram-based loss that enforces a physically structured latent space. Applied to silicon microstrip detectors, HistoAE learns an interpretable two-dimensional latent space corresponding to the particle's charge and impact position. After simple post-processing, it achieves a charge resolution of $0.25\,e$ and a position resolution of $3\,\mu\mathrm{m}$ on beam-test data, comparable to the conventional approach. These results demonstrate that unsupervised deep learning models can enable physically meaningful and quantitatively precise measurements. Moreover, the generative capacity of HistoAE enables straightforward extensions to fast detector simulations.

Paper Structure

This paper contains 7 sections, 5 equations, 7 figures.

Figures (7)

  • Figure 1: Distribution of the maximum channel signal within a cluster as a function of the normalized impact position for mixed-nuclei events in the beam test. The x-axis represents the impact position normalized to the pitch between two readout strips, where a value of 0.5 denotes the midpoint between adjacent strips. The y-axis gives the signal amplitude, and the horizontal bands from bottom to top correspond to different nuclear species.
  • Figure 2: The network architecture of HistoAE. It consists of an encoder and a decoder connected through a two-dimensional latent space. A reconstruction loss ensures that the reconstructed clusters closely match the original ones, while the HistoLoss constrains the two latent dimensions to represent charge and position, respectively.
  • Figure 3: Latent-space visualizations. (a) and (b) show two-dimensional histograms of the latent variables obtained with the WAE and HistoAE, respectively. (c) and (d) present the corresponding latent spaces color-coded by the true nuclear charge, revealing overlapping diagonal bands for the WAE and well-separated horizontal parallel structures for the HistoAE.
  • Figure 4: Physical interpretation of the HistoAE latent space. (a) Projection onto the charge dimension, where each point is color-coded by the independently measured charge from the telescope, shows that each AE-derived charge peak corresponds to the correct nucleus. (b) Comparison between the position dimension and the telescope-predicted impact position, normalized to the interval between two strips, reveals two distinct linear segments.
  • Figure 5: Charge measurement using HistoAE. (a) Local peak finding to determine the AE charge value corresponding to each integer-charge nucleus. (b) Linear correlation between the AE charge and the true integer charge. (c) AE charge rescaled to the physical charge space through interpolation based on the linear relation; each nucleus peak is fitted with a Gaussian function. (d) Charge resolution for each nucleus obtained with HistoAE, given by the width of the fitted Gaussian.
  • ...and 2 more figures
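
The post-processing chain described in the Figure 5 caption (peak finding, linear calibration, rescaling, per-nucleus width) can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption made for illustration and not taken from the paper: the toy latent values, the slope, offset and smearing, the helper names `histogram`, `find_peaks` and `fit_line`, and the use of a plain standard deviation as a stand-in for the Gaussian fit.

```python
import random
import statistics


def histogram(values, n_bins):
    """Fixed-width histogram; returns counts, bin centers and bin width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    return counts, centers, width


def find_peaks(values, n_bins=100, window=5, min_count=20):
    """Local peak finding: a bin is a peak if it is the first maximum
    within +/- `window` bins and holds at least `min_count` entries."""
    counts, centers, width = histogram(values, n_bins)
    peaks = []
    for i, c in enumerate(counts):
        start = max(0, i - window)
        neighbourhood = counts[start:i + window + 1]
        if c >= min_count and neighbourhood.index(max(neighbourhood)) == i - start:
            # Refine the peak position with the mean of nearby values.
            half = (window + 0.5) * width
            near = [v for v in values if abs(v - centers[i]) <= half]
            peaks.append(statistics.fmean(near))
    return peaks


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + offset."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Toy "AE charge" values: a linear function of the integer charge plus
# Gaussian smearing. All three constants are arbitrary assumptions.
rng = random.Random(0)
TRUE_SLOPE, TRUE_OFFSET, LATENT_SIGMA = 0.8, -0.3, 0.08
latent = [rng.gauss(TRUE_SLOPE * z + TRUE_OFFSET, LATENT_SIGMA)
          for z in range(1, 6) for _ in range(2000)]

# (a) peak finding, (b) linear relation between AE charge and integer charge.
# Assumes the lowest peak is Z = 1 and that no species is missing.
peaks = find_peaks(latent)
z_values = list(range(1, len(peaks) + 1))
slope, offset = fit_line(z_values, peaks)

# (c) rescale to physical charge units by inverting the linear relation.
rescaled = [(v - offset) / slope for v in latent]

# (d) per-nucleus width; a standard deviation replaces the Gaussian fit here.
resolution = {z: statistics.stdev([q for q in rescaled if abs(q - z) < 0.5])
              for z in z_values}
print(resolution)
```

On real data the peak-to-charge assignment would need an external reference, such as the telescope charge mentioned in the Figure 4 caption, and a proper Gaussian fit would be less sensitive to tails than the standard deviation used in this sketch.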