TRANSIT your events into a new mass: Fast background interpolation for weakly-supervised anomaly searches

Ivan Oleksiyuk, Svyatoslav Voloshynovskiy, Tobias Golling

TL;DR

TRANSIT addresses the need for fast, data-driven background templates in weakly-supervised LHC anomaly searches by learning a smooth, conditional transport from sidebands to the signal region. It deploys a residual encoder–decoder (TM) with reconstruction, transport, and consistency losses to decorrelate mass dependence in the latent space while preserving mass-dependent structure through the decoder, achieving competitive anomaly sensitivity with substantially reduced training time. The method outperforms several transport-based deep-learning baselines (e.g., RAD-OT, CURTAINSF4F) and offers a robust latent representation (LaTRANSIT) that mitigates background sculpting, enabling high-rejection analyses. Its fast, scalable, data-driven template generation has practical impact for multi-region, multi-model anomaly searches, and code is publicly available for broader adoption.

Abstract

We introduce a new model for conditional and continuous data morphing called TRansport Adversarial Network for Smooth InTerpolation (TRANSIT). We apply it to create a background data template for weakly-supervised searches at the LHC. The method smoothly transforms sideband events to match signal region mass distributions. We demonstrate the performance of TRANSIT using the LHC Olympics R&D dataset. The model captures non-linear mass correlations of features and produces a template that offers a competitive anomaly sensitivity compared to state-of-the-art transport-based template generators. Moreover, the computational training time required for TRANSIT is an order of magnitude lower than that of competing deep learning methods. This makes it ideal for analyses that iterate over many signal regions and signal models. Unlike generative models, which must learn a full probability density, i.e., the correlations between all the variables, the proposed transport model only has to learn a smooth conditional shift of the distribution. This allows for a simpler, more efficient residual architecture, in which mass-uncorrelated features pass through the network unchanged while mass-correlated features are adjusted accordingly. Furthermore, we show that the latent space of the model provides a set of mass-decorrelated features useful for anomaly detection without background sculpting.
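
The residual idea described in the abstract can be illustrated with a toy sketch. The Python below is not the TRANSIT network: the function name `residual_transport`, the per-feature `slopes`, and the linear form of the correction are all assumptions made for illustration, whereas the model in the paper learns a non-linear shift with an encoder and decoder. The sketch only shows why writing the output as "input plus correction" lets mass-uncorrelated features pass through unchanged.

```python
def residual_transport(x, m, m_target, slopes):
    """Toy residual transport of one event from mass m to mass m_target.

    x        : list of feature values for one event
    m        : original mass of the event (e.g. in a sideband)
    m_target : mass the event is transported to (e.g. in the signal region)
    slopes   : hypothetical per-feature coefficients that stand in for a
               trained network; they are NOT taken from the paper
    """
    delta_m = m_target - m
    # Residual form: output = input + correction.
    # A zero slope gives a zero correction, so that feature is unchanged.
    return [xi + si * delta_m for xi, si in zip(x, slopes)]


# Hypothetical event with one mass-uncorrelated and one mass-correlated feature.
event = [0.5, 1.0]
slopes = [0.0, 0.2]

transported = residual_transport(event, m=3.0, m_target=3.5, slopes=slopes)
unchanged = residual_transport(event, m=3.0, m_target=3.0, slopes=slopes)
```

In this toy setting the first feature is returned as is because its slope is zero, and transporting an event to its own mass is the identity map.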

Paper Structure

This paper contains 25 sections, 23 equations, 15 figures, 1 table.

Figures (15)

  • Figure 1: Distributions of high-level observables commonly used in weakly supervised searches within the LHCO R&D dataset, presented for the QCD background and $Z'$ signal.
  • Figure 2: The transport of events from mass $m$ to mass $\hat{m}$. Left: Passage of data through the TRANSIT model with all the main components and losses. Right: The principle of transporting original events in sidebands (green crosses), corresponding to the original mass $m$, along the transport curves (dotted lines) to transformed events (red circles) corresponding to a new mass $\hat{m}$ in the signal region. $\sigma_{perm}$ denotes a random permutation of the batch.
  • Figure 3: Architecture of the encoder (light-blue) and decoder (light-green) networks in TRANSIT.
  • Figure 4: Distributions of five observables for the SR, SB, and a TRANSIT template created by transporting SB events into SR masses. Pull plots illustrate the difference between the SR distribution and the other distributions, expressed in units of the Poisson standard deviation for each bin.
  • Figure 5: ROC curves for a BDT trained to discriminate TRANSIT templates from background SR data and for a BDT trained to discriminate SB latent representations from background SR latent representations in LaTRANSIT. Solid lines and filled regions represent the average and the standard deviation range across 6 TRANSIT network trainings with different initialisation seeds. No signal was added in these runs.
  • ...and 10 more figures