Unsupervised domain adaptation for radioisotope identification in gamma spectroscopy

Peter Lalor, Ayush Panigrahy, Alex Hagen

TL;DR

It is demonstrated that unsupervised domain adaptation (UDA) can improve the ability of a model trained on synthetic data to generalize to a new testing domain, provided unlabeled data from the target domain are available.

Abstract

Training machine learning models for radioisotope identification using gamma spectroscopy remains an elusive challenge for many practical applications, largely stemming from the difficulty of acquiring and labeling large, diverse experimental datasets. Simulations can mitigate this challenge, but the accuracy of models trained on simulated data can deteriorate substantially when deployed to an out-of-distribution operational environment. In this study, we demonstrate that unsupervised domain adaptation (UDA) can improve the ability of a model trained on synthetic data to generalize to a new testing domain, provided unlabeled data from the target domain are available. Conventional supervised techniques are unable to utilize this data because the absence of isotope labels precludes defining a supervised classification loss. Instead, we first pretrain a spectral classifier using labeled synthetic data and subsequently leverage unlabeled target data to align the learned feature representations between the source and target domains. We compare a range of different UDA techniques, finding that minimizing the maximum mean discrepancy (MMD) between source and target feature vectors yields the most consistent improvement to testing scores. For instance, using a custom transformer-based neural network, we achieved a testing accuracy of $0.904 \pm 0.022$ on an experimental LaBr$_3$ test set after performing unsupervised feature alignment via MMD minimization, compared to $0.754 \pm 0.014$ before alignment. Overall, our results highlight the potential of using UDA to adapt a radioisotope classifier trained on synthetic data for real-world deployment.
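The feature-alignment idea described in the abstract can be illustrated with a minimal sketch of a squared-MMD estimate between two batches of feature vectors. This is not the authors' implementation: the single Gaussian kernel, the fixed `bandwidth`, the biased estimator, and the names `gaussian_kernel` and `mmd2` are assumptions chosen for illustration, and the random arrays merely stand in for feature-extractor outputs. In an actual UDA setup, a quantity of this kind would be added to the training loss and minimized with respect to the feature extractor's parameters.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth):
    """Pairwise Gaussian (RBF) kernel matrix between the rows of x and y."""
    # Squared Euclidean distances via the expansion |a - b|^2 = |a|^2 + |b|^2 - 2 a.b
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def mmd2(source, target, bandwidth=1.0):
    """Biased estimate of the squared maximum mean discrepancy.

    source, target: arrays of shape (n_samples, n_features).
    The bandwidth is a hypothetical hyperparameter, not a value from the paper.
    """
    k_ss = gaussian_kernel(source, source, bandwidth)
    k_tt = gaussian_kernel(target, target, bandwidth)
    k_st = gaussian_kernel(source, target, bandwidth)
    return float(k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean())


# Synthetic stand-ins for feature vectors (assumption: 16-dimensional features).
rng = np.random.default_rng(0)
source_feats = rng.normal(0.0, 1.0, size=(256, 16))
shifted_feats = rng.normal(0.5, 1.0, size=(256, 16))  # mimics a domain shift
aligned_feats = rng.normal(0.0, 1.0, size=(256, 16))  # same distribution as source

mmd_shifted = mmd2(source_feats, shifted_feats, bandwidth=4.0)
mmd_aligned = mmd2(source_feats, aligned_feats, bandwidth=4.0)
print(f"MMD^2 (shifted target): {mmd_shifted:.4f}")
print(f"MMD^2 (aligned target): {mmd_aligned:.4f}")
```

The shifted batch should yield a noticeably larger estimate than the aligned batch, which is the signal an MMD-based alignment loss drives toward zero. No isotope labels are needed for either batch, which is why unlabeled target spectra can be used.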

Paper Structure (25 sections, 19 equations, 3 figures, 14 tables)


Figures (3)

  • Figure 1: UMAP visualizations of the feature extractor output from 2000 source spectra (blue 'o' markers) and 2000 target spectra (red 'x' markers) using a TBNN-LinEmb architecture for the sim-to-sim domain adaptation scenario. When training only on the source dataset (left), the target features show poor alignment with the source features. Using DAN (right), the extracted features show better alignment between the source and target domains.
  • Figure 2: UMAP visualizations of the feature extractor output from 2000 source spectra ('o' markers) and 2000 target spectra ('x' markers) using a TBNN-LinEmb architecture for the sim-to-real LaBr$_3$ domain adaptation scenario, with colormap indicating class. Surprisingly, we do not observe a substantial qualitative difference in clustering quality between the source-only model (left) and the DAN model (right).
  • Figure 3: SHAP explanations for a $^{40}$K spectrum using a source-only model (left) and a DAN model (right). The source-only model identifies only the LaBr$_3$ detector's intrinsic 32 keV X-ray peak as salient, whereas the DAN model instead highlights the correct 1460 keV line.