Enhancing low energy reconstruction and classification in KM3NeT/ORCA with transformers

Iván Mozún Mateo

TL;DR

This paper tackles the reconstruction of low-energy neutrino events in KM3NeT/ORCA by introducing a transformer model augmented with physics- and detector-informed attention masks. By encoding domain constraints in the attention mechanism, the approach improves direction and energy reconstruction at low energies compared to traditional maximum-likelihood fits and enables effective transfer learning across telescope configurations. The results show notable gains in AUROC with limited training data when leveraging pretraining on larger configurations, highlighting both improved performance and data efficiency for a detector still under construction. The proposed method offers practical benefits for neutrino oscillation studies and demonstrates how deep learning models can integrate physical knowledge to exploit the full potential of complex neutrino telescope data.

Abstract

The current KM3NeT/ORCA neutrino telescope, still under construction, has not yet reached its full potential in neutrino reconstruction capability. When a deep learning model is trained, it is given no explicit information about the physics or the detector, so both remain unknown to the model. This study leverages the strengths of transformers by incorporating attention masks inspired by the physics and the detector design, so that the model captures both the telescope design and the neutrino physics measured with it. The study also shows the efficacy of transformers in retaining valuable information between detectors when fine-tuning from one configuration to another.
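
The idea of a physics-informed attention mask can be illustrated with a minimal sketch. The abstract does not specify how the masks are built, so everything below is an assumption made for illustration: the mask lets two hits attend to each other only if their time difference is compatible with light travelling between their positions, a causality criterion commonly used for hit selection in neutrino telescopes. The function names, the refractive index value, and the tolerance `tolerance_ns` are hypothetical and not taken from the paper.

```python
import numpy as np

# Assumed illustrative value: speed of light in vacuum (m/ns) divided by an
# assumed group refractive index of sea water. Not a number from the paper.
C_WATER_M_PER_NS = 0.2998 / 1.38


def causal_attention_mask(pos, t, tolerance_ns=20.0):
    """Boolean (n, n) mask; True means hit i may attend to hit j.

    pos: (n, 3) hit positions in meters.
    t:   (n,)   hit times in nanoseconds.
    Two hits are treated as compatible when |dt| <= distance / c_water + tolerance.
    """
    pos = np.asarray(pos, dtype=float)
    t = np.asarray(t, dtype=float)
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    dt = np.abs(t[:, None] - t[None, :])
    return dt <= dist / C_WATER_M_PER_NS + tolerance_ns


def masked_softmax(scores, mask):
    """Row-wise softmax over attention scores, ignoring masked-out pairs."""
    scores = np.where(mask, np.asarray(scores, dtype=float), -np.inf)
    # The diagonal of the mask is always True, so every row has a finite maximum.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)


# Three hits on a vertical line; the third arrives far too late to be
# causally connected to the first two.
positions = [[0.0, 0.0, 0.0], [0.0, 0.0, 9.0], [0.0, 0.0, 200.0]]
times = [0.0, 30.0, 2000.0]
mask = causal_attention_mask(positions, times)
weights = masked_softmax(np.zeros((3, 3)), mask)
```

In this toy case the first two hits share their attention weight equally and the third attends only to itself. In a real transformer the same boolean mask would be applied to the attention logits before the softmax in each head.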

Paper Structure

This paper contains 6 sections, 2 equations, 3 figures.

Figures (3)

  • Figure 1: Example of a $\nu_\mu^{CC}$ event in KM3NeT/ORCA115: colored dots represent the light pulses in the PMTs, and the heatmap shows the arrival time with respect to the mean arrival time of all triggered hits in the event. The red solid line is the $\mu$ track produced after the neutrino interaction.
  • Figure 2: Evolution of the AUROC value for $\nu_\mu^{CC}$/$\nu_e^{CC}$ classification as a function of the training data size in ORCA6 for a model fine-tuned from ORCA115 (purple line) and a model trained from scratch (pink line). Modified from ricap.
  • Figure 3: Left: angular resolution in direction reconstruction for ORCA6 (dark purple) and ORCA115 (light purple). Right: $\log_{10}$ of the ratio between reconstructed and true energy. A comparison with an MLF reconstruction algorithm is also shown in orange.
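
The two quantities plotted in Figure 3 can be computed per event as in the sketch below. This is an illustration under stated assumptions, not the paper's evaluation code: the function names are hypothetical, directions are assumed to be 3D vectors, and energies are assumed to be positive and in the same units.

```python
import numpy as np


def angular_error_deg(dir_true, dir_reco):
    """Opening angle in degrees between true and reconstructed directions.

    Both inputs have shape (n, 3); they are normalized here, so they do not
    need to be unit vectors.
    """
    a = np.asarray(dir_true, dtype=float)
    b = np.asarray(dir_reco, dtype=float)
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    # Clip guards against rounding pushing the cosine slightly outside [-1, 1].
    cos_angle = np.clip(np.sum(a * b, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos_angle))


def log10_energy_ratio(e_true, e_reco):
    """log10(E_reco / E_true); 0 means a perfect energy estimate."""
    e_true = np.asarray(e_true, dtype=float)
    e_reco = np.asarray(e_reco, dtype=float)
    return np.log10(e_reco / e_true)


angles = angular_error_deg([[0, 0, 1], [0, 0, 1]], [[0, 1, 0], [0, 0, 2]])
ratios = log10_energy_ratio([10.0, 5.0], [100.0, 5.0])
```

An angular resolution is often summarized as the median of `angles` in bins of true energy; whether the figure uses the median or another statistic is not stated in the caption.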