An interpretable unsupervised representation learning for high precision measurement in particle physics
Xing-Jian Lv, De-Xing Miao, Zi-Jun Xu, Jian-Chun Wang
TL;DR
This work tackles the interpretability gap in unsupervised representations for precision measurements in particle physics. It introduces the Histogram AutoEncoder (HistoAE), which uses a histogram-based loss (HistoLoss) to enforce a 2D latent space with clear physical meaning: charge and impact position. Applied to silicon microstrip detector beam-test data, HistoAE achieves charge resolution around 0.25 e and position resolution near 3 μm, matching conventional methods while operating in an entirely unsupervised framework. Additionally, the decoder serves as a fast generator for detector responses, enabling rapid simulations and future extensions to more complex latent spaces.
Abstract
Unsupervised learning has been widely applied to various tasks in particle physics. However, existing models lack precise control over their learned representations, limiting physical interpretability and hindering their use for accurate measurements. We propose the Histogram AutoEncoder (HistoAE), an unsupervised representation learning network featuring a custom histogram-based loss that enforces a physically structured latent space. Applied to silicon microstrip detectors, HistoAE learns an interpretable two-dimensional latent space corresponding to the particle's charge and impact position. After simple post-processing, it achieves a charge resolution of $0.25\,e$ and a position resolution of $3\,\mu\mathrm{m}$ on beam-test data, comparable to the conventional approach. These results demonstrate that unsupervised deep learning models can enable physically meaningful and quantitatively precise measurements. Moreover, the generative capacity of HistoAE enables straightforward extensions to fast detector simulations.
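The abstract does not spell out the form of the histogram-based loss, so the following is only an illustrative sketch of how such a loss could be constructed, not the HistoLoss used in the paper. It assumes a single latent coordinate, a smooth (sigmoid-based) binning so the histogram varies continuously with the latent values, and a mean-squared penalty against a chosen target histogram. The names `soft_histogram`, `histogram_loss`, `edges`, and `bandwidth` are hypothetical.

```python
import numpy as np


def _sigmoid(x):
    """Logistic function, used to soften the bin boundaries."""
    return 1.0 / (1.0 + np.exp(-x))


def soft_histogram(z, edges, bandwidth):
    """Smooth histogram of latent values z over the bins defined by edges.

    Each sample contributes to a bin through the difference of two sigmoids
    centred on the bin's lower and upper edge, so the bin contents change
    continuously with z. (Assumption: this smoothing is an illustrative
    choice, not taken from the paper.)
    """
    z = np.asarray(z, dtype=float)[:, None]   # shape (N, 1)
    lower = edges[:-1][None, :]                # shape (1, B)
    upper = edges[1:][None, :]                 # shape (1, B)
    weights = _sigmoid((z - lower) / bandwidth) - _sigmoid((z - upper) / bandwidth)
    hist = weights.sum(axis=0)
    return hist / hist.sum()                   # normalise to unit total


def histogram_loss(z, target, edges, bandwidth=0.02):
    """Mean squared difference between the soft histogram of z and target."""
    return float(np.mean((soft_histogram(z, edges, bandwidth) - target) ** 2))


# Toy comparison: latent values spread over [0, 1] versus values collapsed
# onto a single point, both scored against a flat target histogram.
edges = np.linspace(0.0, 1.0, 11)
target = np.full(10, 0.1)
z_spread = np.linspace(0.025, 0.975, 1000)
z_collapsed = np.full(1000, 0.5)

loss_spread = histogram_loss(z_spread, target, edges)
loss_collapsed = histogram_loss(z_collapsed, target, edges)
print(loss_spread < loss_collapsed)
```

In an actual training loop the same construction would be written in an automatic-differentiation framework so that gradients flow from the histogram penalty back to the encoder; NumPy is used here only to keep the sketch self-contained.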
