Compact Representation of Particle-Collision Events for Physics-Informed Machine Learning
Wasikul Islam, Sergei Chekanov
TL;DR
This work tackles the challenge of high-dimensional collider event representations by introducing RMM-C46, a 46-zone, physics-driven compression of the rapidity–mass matrix (RMM). RMM-C46 aggregates the RMM into physically interpretable blocks using either additive or Frobenius-norm zone aggregation. It preserves key kinematic structures while reducing dimensionality by more than an order of magnitude, which enables efficient classical ML and facilitates deployment on near-term quantum hardware. Empirical results on 13.6 TeV proton–proton Monte Carlo (MC) samples show that RMM-C46 matches or slightly exceeds the full RMM in supervised tasks and significantly improves unsupervised anomaly detection performance, with Frobenius-based aggregation often yielding the best results. The approach offers a practical, quantum-ready representation for HL-LHC-era analyses, providing improved training efficiency, interpretability, and scalable integration with quantum–classical hybrid pipelines. The code is publicly available in the Git repository c46git.
Abstract
We introduce a compact, physics-driven event representation, RMM-C46, designed to compress the high-dimensional rapidity–mass matrix (RMM) into a low-dimensional, interpretable feature set suitable for physics-informed machine learning (ML) and quantum computing applications. The full RMM encodes detailed pairwise correlations among jets, b-jets, leptons, photons, and missing transverse energy, but it contains more than a thousand values per event, which makes it computationally heavy for large-scale training and incompatible with current low-qubit quantum devices. The proposed RMM-C46 input space for ML preserves the physical block structure of the RMM through aggregated invariant-mass, rapidity-difference, and transverse-energy components, reducing the size of the original RMM by over an order of magnitude while maintaining interpretability. Applied to simulated proton–proton collisions at a centre-of-mass energy of 13.6 TeV, the RMM-C46 representations match or exceed the discriminative performance of the full RMM in both supervised and unsupervised ML tasks. Their compactness, stability, and physics transparency also make them naturally compatible with near-term quantum machine learning architectures. RMM-C46 provides a scalable, efficient, and quantum-ready alternative to the full RMM for next-generation collider physics analyses.
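The zone aggregation described above can be illustrated with a minimal sketch. It assumes each zone is a rectangular block of the matrix and collapses it to one number, using either the sum of its entries (additive) or the square root of the sum of squared entries (Frobenius norm). The function name `aggregate_zones`, the 4×4 toy matrix, and the two-zone layout are hypothetical and chosen only for illustration; they are not the authors' code and do not reproduce the actual 46-zone definition.

```python
import numpy as np


def aggregate_zones(matrix, zones, mode="frobenius"):
    """Collapse each rectangular zone of `matrix` into a single feature.

    `zones` is a list of (row_start, row_stop, col_start, col_stop)
    half-open index ranges. This layout is a hypothetical stand-in for
    the physics-motivated zones of RMM-C46.
    """
    features = []
    for r0, r1, c0, c1 in zones:
        block = matrix[r0:r1, c0:c1]
        if mode == "additive":
            # Additive aggregation: plain sum of the block entries.
            features.append(block.sum())
        elif mode == "frobenius":
            # Frobenius aggregation: sqrt of the sum of squared entries.
            features.append(np.sqrt((block ** 2).sum()))
        else:
            raise ValueError(f"unknown aggregation mode: {mode!r}")
    return np.array(features)


# Toy 4x4 stand-in for an RMM (illustrative values only).
toy_rmm = np.array([
    [0.0, 0.3, 0.4, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.6, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.0, 0.0],
])

# Two hypothetical zones: the top two rows and the bottom two rows.
toy_zones = [(0, 2, 0, 4), (2, 4, 0, 4)]

additive_features = aggregate_zones(toy_rmm, toy_zones, mode="additive")
frobenius_features = aggregate_zones(toy_rmm, toy_zones, mode="frobenius")
print(additive_features, frobenius_features)
```

In this sketch both modes produce one feature per zone, so the output length depends only on the number of zones and not on the size of the input matrix. The Frobenius variant weights large entries more heavily than the additive one, which is one plausible reason the two aggregations can behave differently in downstream tasks.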
