Universal New Physics Latent Space

Anna Hallin, Gregor Kasieczka, Sabine Kraml, André Lessa, Louis Moureaux, Tore von Schwartz, David Shih

TL;DR

The space of Beyond the Standard Model (BSM) theories is vast. The paper addresses this by learning a universal, two-dimensional latent space into which SM and diverse BSM predictions are mapped while the relationships between models are preserved. A fully connected encoder is trained with a contrastive loss on event-level observables (MET and jet kinematics) across multiple datasets. Models cluster by their LHC phenomenology, and distances in latent space reflect discriminability and key physical differences such as the mass splitting $\Delta m$ and the mediator type. Across the MSSM gluino, dark-matter mediator, and Dark Machines anomaly datasets, the latent embeddings separate models into phenomenology-driven regions and highlight the observables that drive discrimination, which enables reduced benchmark sets and the identification of coverage gaps. The approach is intended to improve reinterpretation, model-space coverage assessment, and targeted exploration for future LHC searches. Future work aims to incorporate cross-sections, richer feature sets, and latent-space sampling to reconstruct physical observables. The 2-D latent-space visualizations and distance relationships provide a practical framework for navigating the BSM landscape at colliders.
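To make the encoder-plus-contrastive-loss idea concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the margin-based pairwise loss, the layer sizes, the `margin` value, and the random toy features (standing in for MET and jet kinematics) are all assumptions chosen for illustration, and no training loop is included.

```python
import numpy as np

def init_weights(sizes, rng):
    """Random weights for a fully connected network (hypothetical layer sizes)."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def encode(x, weights):
    """Fully connected encoder: ReLU hidden layers, linear 2-D output."""
    h = x
    for W, b in weights[:-1]:
        h = np.maximum(h @ W + b, 0.0)
    W, b = weights[-1]
    return h @ W + b

def contrastive_loss(z, labels, margin=1.0):
    """Generic margin-based pairwise contrastive loss (assumed form).

    Same-model pairs are pulled together (squared distance); different-model
    pairs are pushed apart until they are at least `margin` away.
    """
    diff = z[:, None, :] - z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1) + 1e-12)
    same = labels[:, None] == labels[None, :]
    pair_loss = np.where(same, dist ** 2, np.maximum(margin - dist, 0.0) ** 2)
    upper = np.triu_indices(len(z), k=1)  # count each unordered pair once
    return pair_loss[upper].mean()

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 5))           # toy stand-in for event observables
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # which model each event came from
weights = init_weights([5, 16, 2], rng)
z = encode(features, weights)                # 2-D latent coordinates, shape (8, 2)
loss = contrastive_loss(z, labels)
```

In this form the loss is close to zero when events from the same model coincide and events from different models sit farther apart than the margin, which is the qualitative behaviour the summary describes.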

Abstract

We develop a machine learning method for mapping data originating from both Standard Model processes and various theories beyond the Standard Model into a unified representation (latent) space while conserving information about the relationship between the underlying theories. We apply our method to three examples of new physics at the LHC of increasing complexity, showing that models can be clustered according to their LHC phenomenology: different models are mapped to distinct regions in latent space, while indistinguishable models are mapped to the same region. This opens interesting new avenues on several fronts, such as model discrimination, selection of representative benchmark scenarios, and identifying gaps in the coverage of model space.
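One simple way to read "distinct regions" versus "the same region" is to compare the centroids of each model's latent embedding. The sketch below does this on synthetic 2-D points. The model names `A`, `B`, `C`, their cluster centres, and the spread are hypothetical, and centroid distance is an assumed proxy rather than the measure used in the paper.

```python
import numpy as np

def centroid_distances(z, labels):
    """Pairwise Euclidean distances between per-model latent centroids."""
    models = np.unique(labels)
    centroids = np.stack([z[labels == m].mean(axis=0) for m in models])
    diff = centroids[:, None, :] - centroids[None, :, :]
    return models, np.sqrt((diff ** 2).sum(axis=-1))

# Synthetic latent points: A and B nearly overlap, C sits far away.
rng = np.random.default_rng(1)
centers = {"A": (0.0, 0.0), "B": (0.1, 0.0), "C": (4.0, 3.0)}
z = np.concatenate([rng.normal(c, 0.3, size=(500, 2)) for c in centers.values()])
labels = np.repeat(list(centers), 500)

models, dist = centroid_distances(z, labels)
# dist[i, j] is small for models that share a region and large otherwise.
```

Here `A` and `B` would count as hard to distinguish, while `C` occupies its own region. A threshold on `dist` could then be used to pick one representative benchmark per group.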

Paper Structure (16 sections, 6 equations, 17 figures, 1 table)

Figures (17)

  • Figure 1: Process simulated for the MSSM gluino dataset.
  • Figure 2: Distributions of the leading jet $p_T$, plus MET and $m^2_{1,2}$, for each of the mass configurations in the MSSM gluino dataset.
  • Figure 3: Resulting latent space for the MSSM gluino dataset, showing contours corresponding to a CDF value of 0.5 for each model. The different contours are colored based on the mass difference $\Delta m=m_{\tilde{g}}-m_{\tilde{\chi}^0_1}$, which is also given as the contour label (in TeV); dark red: $\Delta m=2$ TeV; red: $\Delta m=1.6$ (dashed) and $1.5$ (solid) TeV; green: $\Delta m=1.2$ (dash-dot), $1.1$ (dashed) and $1.0$ (solid) TeV; cyan: $\Delta m=0.7$ (dash-dot) and $0.6$ (dashed) TeV; blue: $\Delta m=0.2$ TeV. The linestyles are chosen based on the gluino mass: solid for $m_{\tilde{g}}=2.1$ TeV, dashed for $m_{\tilde{g}}=1.6$ TeV, and dotted for $m_{\tilde{g}}=1.1$ TeV.
  • Figure 4: Examples of diagrams for DM production in the: a) vector mediator, b) pseudoscalar mediator and c) squark mediator models.
  • Figure 5: MET distribution of the three different mediator types with $m_{\rm DM}=100$ GeV and the highest and lowest of the mediator masses. It is clearly visible that while the peak of the distribution stays in the same place for the vector and pseudoscalar mediators, for the squark mediator it gets shifted with higher mediator mass.
  • ...and 12 more figures