Deep-learning-based decomposition of overlapping-sparse images: application at the vertex of neutrino interactions

Saúl Alonso-Monsalve; Davide Sgalaberna; Xingyu Zhao; Adrien Molines; Clark McGrew; André Rubbia

TL;DR

This work addresses the decomposition of overlapping-sparse detector images at the vertex of neutrino interactions by combining a transformer-based decomposer with a differentiable GAN generator. The transformer estimates the vertex position and iteratively reconstructs the kinematics of each particle (kinetic energy and direction); gradient-descent minimisation through the differentiable generator then refines these estimates. On simulated CC0π-like events, the method recovers kinetic energy with high precision, localises the vertex robustly (~2 mm), and improves visible-energy accuracy substantially over standard unfolding, even when complex nuclear clusters are included. The approach reduces model dependence in energy reconstruction, enables uncertainty quantification, and offers a path to improved neutrino-oscillation measurements in future experiments. The combination of a decomposing transformer, a differentiable GAN, and gradient-based refinement is a framework for disentangling overlapping-sparse images in particle physics and potentially beyond.

Abstract

Image decomposition plays a crucial role in various computer vision tasks, enabling the analysis and manipulation of visual content at a fundamental level. Overlapping images, which occur when multiple objects or scenes partially occlude each other, pose unique challenges for decomposition algorithms. The task becomes harder with sparse images, where the scarcity of meaningful information complicates the precise extraction of components. This paper presents a solution that uses deep learning to accurately extract individual objects within multi-dimensional overlapping-sparse images, with a direct application in high-energy physics: the decomposition of overlaid elementary particles recorded by imaging detectors. In particular, the proposed approach tackles a highly complex and still unsolved problem: identifying and measuring independent particles at the vertex of neutrino interactions, where one expects to observe detector images with multiple indiscernible overlapping charged particles. By decomposing the image of the detector activity at the vertex through deep learning, it is possible to infer the kinematic parameters of the identified low-momentum particles, which would otherwise be neglected, and to enhance the reconstructed energy resolution of the neutrino event. We also present an additional step, which can be tuned directly on detector data, that combines the above method with a fully differentiable generative model to further improve the image decomposition and, consequently, the resolution of the measured parameters, achieving unprecedented results. This improvement is crucial for precisely measuring the parameters that govern neutrino flavour oscillations and for searching for asymmetries between matter and antimatter.

Paper Structure (14 sections, 3 equations, 6 figures, 1 table, 2 algorithms)

Introduction
Results
Initial case
Leveraging the generator for an enhanced decomposition
Nuclear clusters
Comparison with standard method
Discussion
Methods
Simulated datasets
Transformer for sparse-image decomposition
Generative adversarial network for fast elementary-particle simulations
Gradient-descent minimisation of the image parameters
Data availability
Code availability

Figures (6)

Figure 1: Pipeline for an example neutrino interaction. (1) Detection of the vertex activity (VA) region in the input event. The zoomed-in view unveils the particle mosaic at the interaction vertex. (2) Utilisation of the VA image, combined with the reconstructed kinematic parameters of the escaping muon, as input for the transformer encoder. The transformer encoder processes the input, resulting in (3) the reconstruction of the interaction vertex position and (4) an embedded representation of the VA event for the decoder. (5) The transformer decoder initially processes the encoder's information, resulting in a prediction for the kinematics of the most energetic particle in the input VA. Simultaneously, it generates a boolean variable to signify the existence of additional particles that require reconstruction. In cases where additional particles are identified, the transformer decoder proceeds to iteratively provide their kinematic predictions in descending order of the kinetic energies of the particles. This process continues by incorporating encoder data and the kinematics of the previously predicted particle until the boolean variable signals termination, indicating that the transformer has determined no further particles are present in the VA event. (6) A generative adversarial network (GAN) generator produces images of particles based on the kinematics predicted by the transformer. Using the initial reconstructed kinematics, the GAN is also employed to generate an image of the escaping muon. (7) The generated images are aggregated by summing their voxel photoelectrons, and (8) compared with the input VA event to verify the decomposition process. The workflow may return to step 6 to further optimise kinematic parameters.
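The iterative decoding in step (5) of Figure 1 can be pictured with a minimal sketch. It assumes a hypothetical `predict_next` callable standing in for the transformer decoder; the encoder output that the real decoder also consumes is omitted, and the names, the tuple layout, and the `max_particles` cap are illustrative choices rather than details from the paper.

```python
from typing import Callable, List, Optional, Tuple

# (kinetic energy in MeV, theta in degrees, phi in degrees); layout is an assumption
Kinematics = Tuple[float, float, float]


def decode_particles(
    predict_next: Callable[[Optional[Kinematics]], Tuple[Kinematics, bool]],
    max_particles: int = 10,
) -> List[Kinematics]:
    """Collect per-particle predictions until the continuation flag is False.

    predict_next is a hypothetical stand-in for the decoder: given the previously
    predicted particle (None on the first call), it returns the next particle's
    kinematics and a boolean saying whether more particles remain.
    """
    particles: List[Kinematics] = []
    previous: Optional[Kinematics] = None
    for _ in range(max_particles):  # hard cap in case the flag never clears
        kinematics, more_remaining = predict_next(previous)
        particles.append(kinematics)
        previous = kinematics
        if not more_remaining:
            break
    return particles


def make_stub_decoder(truth: List[Kinematics]):
    """Toy decoder that replays a fixed list, most energetic particle first."""
    ordered = sorted(truth, key=lambda k: k[0], reverse=True)
    state = {"index": 0}

    def predict_next(previous: Optional[Kinematics]) -> Tuple[Kinematics, bool]:
        i = state["index"]
        state["index"] += 1
        return ordered[i], i + 1 < len(ordered)

    return predict_next


truth = [(35.0, 80.0, 10.0), (120.0, 45.0, 200.0), (60.0, 100.0, 300.0)]
decoded = decode_particles(make_stub_decoder(truth))
print(decoded)
```

The stub simply replays known particles in descending kinetic energy, which mirrors the output ordering described in the caption; a trained decoder would instead condition on the encoder embedding and the previous prediction.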
Figure 2: Main results of the vertex-activity (VA) fitting algorithm on the testing dataset, consisting of events with one muon and one to five protons. $\mu$: mean, $\sigma$: standard deviation, false positives (negatives): over (under)-reconstructed particles (with default true (reconstructed) values: kinetic energy (KE) = 0 MeV; $\theta=$ 90 degrees; $\phi=$ 180 degrees). a Top: histogram of the total vs reconstructed KE for each event. The plot includes false positives and false negatives. Middle: KE resolution per particle relative to true KE and momentum (P) for each fitted particle. The plot excludes false positives and false negatives. Bottom: distribution of the difference between the reconstructed KE and the true KE per particle for the cases with 1-5 particles per event. The plot includes false positives and false negatives. b Left: confusion matrix showing the recall (normalisation by columns) and precision (normalisation by rows) of the true vs reconstructed number of particles. Right: distribution of the false positives (particles predicted by the algorithm that are not present in the input events) and false negatives (particles not predicted by the algorithm that are present in the input events). c Top: density plot relating the difference between reconstructed and true $\phi$ and $\theta$ (spherical coordinates) per particle. Bottom: density plot relating the difference between reconstructed and true angles and the particle length. Both plots include false positives and false negatives. d Top: scatter plot comparing the true and reconstructed vertex per event for the x, y, and z coordinates. Bottom: box plot of the 3D Euclidean distance between the reconstructed and true vertices for the cases with 1-5 particles per event.
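The two normalisations of the confusion matrix in panel b of Figure 2 can be illustrated with a small sketch. It assumes that rows index the reconstructed multiplicity and columns the true multiplicity, which is what "recall (normalisation by columns)" implies; the function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def count_matrix(true_n: Sequence[int], reco_n: Sequence[int], n_max: int) -> List[List[int]]:
    """counts[r][t]: events with reconstructed multiplicity r + 1 and true multiplicity t + 1."""
    counts = [[0] * n_max for _ in range(n_max)]
    for t, r in zip(true_n, reco_n):
        counts[r - 1][t - 1] += 1
    return counts


def normalise_columns(counts: List[List[int]]) -> List[List[float]]:
    """Each column (fixed true value) sums to 1, so the diagonal holds the recall."""
    n = len(counts)
    col_sums = [sum(counts[r][t] for r in range(n)) for t in range(n)]
    return [
        [counts[r][t] / col_sums[t] if col_sums[t] else 0.0 for t in range(n)]
        for r in range(n)
    ]


def normalise_rows(counts: List[List[int]]) -> List[List[float]]:
    """Each row (fixed reconstructed value) sums to 1, so the diagonal holds the precision."""
    return [[c / sum(row) if sum(row) else 0.0 for c in row] for row in counts]


# Toy multiplicities, not taken from the paper
true_n = [1, 1, 2, 2, 2, 3]
reco_n = [1, 1, 2, 2, 1, 3]

counts = count_matrix(true_n, reco_n, n_max=3)
recall = normalise_columns(counts)
precision = normalise_rows(counts)
print(recall[1][1], precision[0][0])
```

In this toy sample one true two-particle event is reconstructed with a single particle, which lowers the recall for multiplicity 2 and the precision for multiplicity 1 while leaving the other diagonal entries at 1.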
Figure 3: Kinematic parameter optimisation via gradient-descent minimisation. a Processing of a target event: The target event, along with its reconstructed muon kinematics, is input to the transformer (depicted in blue). The transformer generates a set of possible kinematic combinations for all particles within the target event. These kinematics are subsequently forwarded to the gradient-descent minimiser (depicted in red), which leverages the generative adversarial network (GAN) to refine the kinematics and improve the correspondence with the target event. The diagram visually represents the different decompositions resulting from this process. Kinematic parameters: kinetic energy (KE), direction in spherical coordinates ($\theta$ and $\phi$), interaction vertex 3D position (x, y, z). b The plot shows two scenarios in a likelihood space (pre-computed for the kinetic energy of the two most energetic protons of an arbitrary event: KE$_{\text{proton}_{1}}$ and KE$_{\text{proton}_{2}}$). One starts from the transformer output and successfully reaches the target values, while the other begins at a random point in parameter space and gets stuck in a local minimum. c Profiled negative log-likelihood $\mathcal{L}$ for the kinetic energy of the most energetic proton (KE$_{\text{proton}_{1}}$) of an arbitrary event. The curve shows the 68% confidence interval determined by a $\Delta\mathcal{L}$ of 1 for one degree of freedom. d The resolution of kinetic energy (KE), as determined through an analysis of sets of random and "hard" events (i.e., events where the image reconstructed from the transformer exceeded a predefined mean-squared-error threshold in comparison to the target image), was assessed for three distinct methodologies: the transformer and two gradient-descent techniques ("GAN (gr. descent 1)" and "GAN (gr. descent)", as per Algorithms 1 and 2 from the section "Gradient-descent minimisation of the image parameters", respectively). The comparison illustrates the effectiveness of GAN-based minimisation in refining the kinematic parameters.
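The refinement idea in Figure 3 can be sketched with a deliberately simple toy. Here a linear generator with fixed, hypothetical voxel profiles replaces the GAN, and only the two kinetic energies are fitted by minimising the mean-squared error between the generated and target images. This toy loss is convex, so it cannot reproduce the local minimum of panel b, and it is not the paper's Algorithm 1 or 2.

```python
from typing import List, Sequence

# Hypothetical fixed voxel profiles: which voxels each of two particles lights up.
PROFILES = [
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 1.0, 1.0],
]


def generate(kes: Sequence[float]) -> List[float]:
    """Toy differentiable generator: summed image, linear in each kinetic energy."""
    n_voxels = len(PROFILES[0])
    return [sum(ke * prof[i] for ke, prof in zip(kes, PROFILES)) for i in range(n_voxels)]


def mse(image: Sequence[float], target: Sequence[float]) -> float:
    """Mean-squared error between two images of equal length."""
    return sum((a - b) ** 2 for a, b in zip(image, target)) / len(target)


def mse_gradient(kes: Sequence[float], target: Sequence[float]) -> List[float]:
    """Analytic gradient of the MSE with respect to each kinetic energy."""
    residual = [a - b for a, b in zip(generate(kes), target)]
    n = len(target)
    return [2.0 / n * sum(r * p for r, p in zip(residual, prof)) for prof in PROFILES]


def refine(seed: Sequence[float], target: Sequence[float],
           learning_rate: float = 0.5, steps: int = 200) -> List[float]:
    """Plain gradient descent on the image mismatch, starting from the seed."""
    kes = list(seed)
    for _ in range(steps):
        grad = mse_gradient(kes, target)
        kes = [k - learning_rate * g for k, g in zip(kes, grad)]
    return kes


target = generate([40.0, 25.0])  # image of the toy "true" event
seed = [36.0, 29.0]              # imperfect first guess, as a transformer might provide
fitted = refine(seed, target)
print(mse(generate(seed), target), mse(generate(fitted), target))
```

With a non-linear generator such as a GAN, the gradient would come from automatic differentiation rather than a closed form, and the starting point matters much more, which is why panel b contrasts a transformer seed with a random one.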
Figure 4: Main results of the vertex-activity (VA) fitting algorithm on the testing dataset, consisting of events with one muon, 0-4 protons (p), 0-1 deuterium (D), and 0-1 tritium (T). $\mu$: mean, $\sigma$: standard deviation, false positives (negatives): over (under)-reconstructed particles (with default true (reconstructed) values: kinetic energy (KE) = 0 MeV; $\theta=$ 90 degrees; $\phi=$ 180 degrees). a Top: histogram of the total vs reconstructed KE for each event. The plot includes false positives and false negatives. Middle: KE resolution per particle relative to true KE and momentum (P). The plot excludes false positives and false negatives. Bottom: distribution of the difference between the reconstructed KE and the true KE per particle for the cases with 1-6 particles per event. The plot includes false positives and false negatives. b Left: confusion matrix showing the recall (normalisation by columns) and precision (normalisation by rows) of the true vs reconstructed number of particles. Right: distribution of the false positives (particles predicted by the algorithm that are not present in the input events) and false negatives (particles not predicted by the algorithm that are present in the input events). c Top: density plot relating the difference between reconstructed and true $\phi$ and $\theta$ (spherical coordinates) per particle. Bottom: density plot relating the difference between reconstructed and true angles and the particle length. Both plots include false positives and false negatives. d Top: scatter plot comparing the true and reconstructed vertex per event for the x, y, and z coordinates. Bottom: box plot of the 3D Euclidean distance between the reconstructed and true vertices for the cases with 1-6 particles per event. e Left: confusion matrix showing the recall (normalisation by columns) and precision (normalisation by rows) of the particle identification. Middle: particle-identification probability distributions when the true particle is a proton. Right: particle-identification probability distributions when the true particle is a nuclear cluster (either deuterium or tritium).
Figure 5: Architecture and training curves of the decomposing transformer. a Left to right: at training time, a variable number of particles with similar (within $\sim$0.2 mm) initial positions are selected from the dataset and combined to create an event, and multiple events form a training batch. The input voxel data (comprising energy loss and spatial coordinates) is subsequently processed and passed through the decomposing transformer. This transformer first generates a prediction for the vertex positions and then provides estimates of the kinematic parameters and termination conditions for each particle in the events. The diagram illustrates the evolving tensor shapes at various stages of the processing. b Training and validation curves for the different outputs of the network, showing a smooth convergence of the model (this plot corresponds to the model from the section "Initial case"; similar curves are observed for the model from the section "Nuclear clusters"). The dashed purple lines show the learning-rate schedule. The parentheses indicate the tensor dimensions at each stage.
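The event-building step in panel a of Figure 5 (combining single-particle images that start close together) can be sketched as below. The sparse-image representation as a dictionary from voxel index to photoelectrons, the helper names, and the toy values are assumptions made for illustration; only the roughly 0.2 mm tolerance comes from the caption.

```python
import math
from typing import Dict, List, Sequence, Tuple

Voxel = Tuple[int, int, int]
SparseImage = Dict[Voxel, float]  # voxel index -> photoelectrons (assumed layout)


def close_enough(a: Sequence[float], b: Sequence[float], tol_mm: float = 0.2) -> bool:
    """True when two particle start positions (in mm) lie within the tolerance."""
    return math.dist(a, b) <= tol_mm


def overlay(images: List[SparseImage]) -> SparseImage:
    """Sum the signal voxel by voxel; voxels hit by several particles accumulate."""
    event: SparseImage = {}
    for image in images:
        for voxel, pe in image.items():
            event[voxel] = event.get(voxel, 0.0) + pe
    return event


# Toy single-particle images sharing the vertex voxel (0, 0, 0)
muon: SparseImage = {(0, 0, 0): 30.0, (1, 0, 0): 28.0}
proton: SparseImage = {(0, 0, 0): 55.0, (0, 1, 0): 70.0}

muon_start = (0.0, 0.0, 0.0)
proton_start = (0.1, 0.1, 0.0)

if close_enough(muon_start, proton_start):
    event = overlay([muon, proton])
else:
    event = dict(muon)
print(event)
```

The same voxel-wise sum is what step (7) of Figure 1 describes for aggregating generated images before they are compared with the input event.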
...and 1 more figure
