Training toward significance with the decorrelated event classifier transformer neural network
Jaebak Kim
TL;DR
The paper aims to improve resonance-search sensitivity by binning events with a transformer-based event classifier while controlling the correlation between the classifier output and the reconstructed mass. It introduces an event classifier transformer architecture and three training techniques (extreme loss, DisCo regularization, and data-scope training) intended to enhance significance and reduce mass correlation, together with significance-based epoch selection. In a simplified $H\rightarrow Z(\ell\ell)\gamma$ study, the transformer trained with these techniques yields the highest expected significance and the lowest mass correlation, outperforming boosted decision trees and feed-forward networks. The work demonstrates a practical approach to decorrelated, significance-optimized bump-hunt analyses, with methodological and implementation details provided for reproducibility.
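To make the mass-decorrelation idea concrete, the following is a minimal sketch of the sample distance correlation, the quantity that DisCo-style regularization penalizes. It is an illustration in NumPy and is not taken from the paper's code; the function name `distance_correlation` and the choice to return the unsquared correlation are assumptions made here.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D arrays (hypothetical helper).

    Returns a value in [0, 1]; values near 0 indicate that the samples show
    little dependence, which is the goal when decorrelating a classifier
    output from the reconstructed mass.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Pairwise absolute-distance matrices.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])

    # Double-center each matrix: subtract row and column means, add grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()

    dcov2_xy = (A * B).mean()
    dcov2_xx = (A * A).mean()
    dcov2_yy = (B * B).mean()

    denom = np.sqrt(dcov2_xx * dcov2_yy)
    if denom == 0.0:
        # A constant input carries no dependence information.
        return 0.0
    return float(np.sqrt(max(dcov2_xy, 0.0) / denom))
```

In a training loop, a term proportional to this quantity (or its square), evaluated between the network output and the reconstructed mass on background events, would be added to the classification loss. The weighting of that term is a tunable choice that this sketch does not cover.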
Abstract
Experimental particle physics uses machine learning for many tasks, one of which is classifying signal and background events. The classifier output can be used to bin an analysis region, which enhances the expected significance of a mass resonance search. In natural language processing, one of the leading neural network architectures is the transformer. In this work, an event classifier transformer is proposed to bin an analysis region, and the network is trained with specialized techniques. These techniques can enhance the significance and reduce the correlation between the network's output and the reconstructed mass. The network trained in this way is found to perform better than boosted decision trees and feed-forward networks.
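As an illustration of why binning by classifier output can raise the expected significance, the sketch below computes a per-bin significance with the standard Asimov approximation and combines the bins in quadrature. This is a simplified stand-in and not necessarily the procedure used in the paper; the function names, the quadrature combination, and the bin edges are assumptions made for this example.

```python
import numpy as np


def asimov_significance(s, b):
    """Asimov approximation of the expected significance for s signal
    and b background events (requires b > 0)."""
    s = np.asarray(s, dtype=float)
    b = np.asarray(b, dtype=float)
    inner = (s + b) * np.log1p(s / b) - s
    # Clip tiny negative values caused by floating-point rounding.
    return np.sqrt(2.0 * np.maximum(inner, 0.0))


def binned_significance(scores, is_signal, weights, edges):
    """Combine per-bin significances in quadrature (hypothetical helper).

    scores    : classifier output per event
    is_signal : boolean array, True for signal events
    weights   : per-event weights (expected yields)
    edges     : bin edges in classifier output
    """
    scores = np.asarray(scores, dtype=float)
    is_signal = np.asarray(is_signal, dtype=bool)
    weights = np.asarray(weights, dtype=float)

    s, _ = np.histogram(scores[is_signal], bins=edges, weights=weights[is_signal])
    b, _ = np.histogram(scores[~is_signal], bins=edges, weights=weights[~is_signal])

    # Skip bins without background, where the approximation is undefined.
    mask = b > 0
    z = asimov_significance(s[mask], b[mask])
    return float(np.sqrt(np.sum(z ** 2)))
```

For example, if the signal is concentrated at high classifier output, splitting the region into a low-output and a high-output bin gives a larger combined significance than a single inclusive bin, because the signal-poor events no longer dilute the signal-rich ones.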
