Improving the performance of weak supervision searches using transfer and meta-learning

Hugues Beauchesne, Zong-En Chen, Cheng-Wei Chiang

TL;DR

Weak supervision searches in collider physics are limited by the need for a sizable amount of signal to train neural networks. The authors address this bottleneck by combining transfer learning (pretraining on simulations) with meta-learning, which produces networks that learn faster and therefore adapt with less experimental data. Using dark-shower signals generated with the Pythia Hidden Valley module and a CNN-based CWoLa framework, they show that transfer learning substantially lowers the amount of signal required for a $5\sigma$ discovery, often by a factor of a few, and that meta-transfer learning provides additional gains. This work offers a practical proof of principle for making weak supervision more robust to limited signal and points to future work on model choices and systematic uncertainties.

Abstract

Weak supervision searches have in principle the advantages of both being able to train on experimental data and being able to learn distinctive signal properties. However, the practical applicability of such searches is limited by the fact that successfully training a neural network via weak supervision can require a large amount of signal. In this work, we seek to create neural networks that can learn from less experimental signal by using transfer and meta-learning. The general idea is to first train a neural network on simulations, thereby learning concepts that can be reused or becoming a more efficient learner. The neural network would then be trained on experimental data and should require less signal because of its previous training. We find that transfer and meta-learning can substantially improve the performance of weak supervision searches.
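
The two-stage idea in the abstract (pretrain on simulations, then train on experimental data with weak labels) can be illustrated with a toy sketch. This is not the authors' CNN setup: it swaps in a logistic regression on two Gaussian features, and every name (`train_logreg`, `run_demo`), sample size, and the deliberately mismodelled simulated signal is a hypothetical choice made only for illustration. The weak-supervision stage follows the CWoLa idea of labelling events by region (signal region vs. sideband) rather than by truth.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    # Append a constant column so the last weight acts as the bias term.
    return np.hstack([X, np.ones((len(X), 1))])

def train_logreg(X, y, w=None, lr=0.05, epochs=50):
    """Gradient descent on binary cross-entropy.
    Passing `w` starts from pretrained weights (the transfer step)."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1]) if w is None else w.copy()
    for _ in range(epochs):
        w -= lr * Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
    return w

def auc(sig_scores, bkg_scores):
    # Mann-Whitney estimate of P(signal score > background score).
    s = np.concatenate([sig_scores, bkg_scores])
    ranks = s.argsort().argsort() + 1
    n_s, n_b = len(sig_scores), len(bkg_scores)
    return (ranks[:n_s].sum() - n_s * (n_s + 1) / 2) / (n_s * n_b)

def run_demo(seed=0, n_bkg=10000, n_sig=300):
    rng = np.random.default_rng(seed)
    bkg = lambda n: rng.normal(0.0, 1.0, size=(n, 2))
    sig_true = lambda n: rng.normal([1.5, 1.5], 1.0, size=(n, 2))
    # Assumption: the simulated signal is slightly mismodelled.
    sig_sim = lambda n: rng.normal([1.5, 1.0], 1.0, size=(n, 2))

    # Stage 1: fully supervised pretraining on simulation (truth labels).
    X_sim = np.vstack([sig_sim(5000), bkg(5000)])
    y_sim = np.concatenate([np.ones(5000), np.zeros(5000)])
    w_pre = train_logreg(X_sim, y_sim, lr=0.1, epochs=200)

    # Stage 2: CWoLa-style weak labels on "data": SR = 1, SB = 0.
    sr = np.vstack([bkg(n_bkg), sig_true(n_sig)])  # background + a little signal
    sb = bkg(n_bkg)                                # background only
    X_mix = np.vstack([sr, sb])
    y_mix = np.concatenate([np.ones(len(sr)), np.zeros(len(sb))])
    w_transfer = train_logreg(X_mix, y_mix, w=w_pre)  # start from pretrained weights
    w_scratch = train_logreg(X_mix, y_mix)            # start from zeros

    # Evaluate on pure signal vs. pure background (possible only in a toy).
    test_sig, test_bkg = add_bias(sig_true(5000)), add_bias(bkg(5000))
    return {
        "auc_scratch": auc(test_sig @ w_scratch, test_bkg @ w_scratch),
        "auc_transfer": auc(test_sig @ w_transfer, test_bkg @ w_transfer),
    }

if __name__ == "__main__":
    print(run_demo())
```

Calling `run_demo()` with smaller values of `n_sig` gives a rough feel for how the two starting points compare as the injected signal shrinks. The numbers from this toy say nothing about the paper's actual results.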

Paper Structure

This paper contains 7 sections, 6 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Dijet invariant mass distributions for the indirect decaying (ID) scenario with $\Lambda_D = 10$ GeV and for the SM background. Distributions are normalized to unity. Both signal and background satisfy the selection criteria of Table \ref{tab:madgraph-macro}, except for the signal region (SR) or sideband (SB) conditions.
  • Figure 2: (a) A 2D $P_T$ histogram for one signal event in the SR before rotation and flipping. (b) A 2D $P_T$ histogram of the same event after complete preprocessing. (c) The average histogram for 10k background events in the SR after preprocessing. (d) The average histogram for 10k signal events in the SR after preprocessing. These plots are for the leading jet with $75 \times 75$ resolution and the ID scenario with $\Lambda_D = 10$ GeV.
  • Figure 3: The results of CNN CWoLa for the ID (left column) and DD (right column) scenarios with $\Lambda_D = 10$ GeV for $25 \times 25$, $50 \times 50$ and $75 \times 75$ resolutions. The dotted line in each plot has a slope of 1.
  • Figure 4: The results of transfer learning (solid curves) and of CWoLa (dashed curves, same as those in Fig. 3) for the ID (left column) and DD (right column) scenarios with $\Lambda_D = 10$ GeV for $25 \times 25$, $50 \times 50$ and $75 \times 75$ resolutions. The dotted line in each plot has a slope of 1.
  • Figure 5: The results of meta-transfer learning (solid curves) and transfer learning (dashed curves, same as those in Fig. 4) for the ID (left column) and DD (right column) scenarios with $\Lambda_D = 10$ GeV for $25 \times 25$ and $50 \times 50$ resolutions. The dotted line in each plot has a slope of 1.
  • ...and 2 more figures