Binary Classification of Light and Dark Time Traces of a Transition Edge Sensor Using Convolutional Neural Networks

Elmeri Rivasto, Katharina-Sophie Isleif, Friederike Januschek, Axel Lindner, Manuel Meyer, Gulden Othman, José Alejandro Rubiera Gimeno, Christina Schwemmbauer

TL;DR

This work evaluates CNN-based binary classification of univariate TES time traces to separate 1064 nm light pulses from dark backgrounds in the ALPS II context. Despite careful architectural design and hyperparameter tuning, the CNN ensemble under current extrinsic backgrounds underperforms the traditional cut-based analysis, with training confusion arising from near-1064 nm black-body photons distorting the dark training set. Relabeling a subset of misclassified dark pulses as light improves metrics but does not surpass the cut-based approach, pointing to label noise as a fundamental limitation. The authors propose regression-based, unsupervised, and hardware-filtering strategies, including cryogenic optical filtering, to achieve the target background level of $10^{-5}$ Hz and enable robust TES energy-resolved detection for ALPS II and similar univariate time-trace detectors.

Abstract

The Any Light Particle Search II (ALPS II) is a light-shining-through-a-wall experiment probing the existence of axions and axion-like particles using a 1064 nm laser source. While ALPS II is already taking data using a heterodyne-based detection scheme, cryogenic transition edge sensor (TES) based single-photon detectors are planned to expand the detection system for cross-checking potential signals, for which a sensitivity on the order of $10^{-24}$ W is required. To reach this goal, we have investigated the use of convolutional neural networks (CNNs) as binary classifiers to distinguish experimentally measured pulses triggered by 1064 nm photons (light pulses) from background (dark) pulses. Despite extensive hyperparameter optimization, the CNN-based binary classifier did not outperform our previously optimized cut-based analysis in terms of detection significance. This suggests that the approach is not generally suitable for suppressing background and improving the energy resolution of the TES. We partly attribute this to the training confusion induced by near-1064 nm black-body photon triggers in the background, which we identified in our previous works as the limiting background source. However, we argue that the problem ultimately lies in the binary-classification-based approach and believe that regression models would be better suited to addressing the energy resolution. Unsupervised machine learning models, in particular neural-network-based autoencoders, should also be considered as potential candidates for suppressing noise in time traces. While the presented results and conclusions were obtained for a TES designed for the ALPS II experiment, they should hold equally well for any device whose output signal can be treated as a univariate time trace.

Paper Structure

This paper contains 12 sections, 4 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: (a) A circuit diagram of the experimental TES setup. Applying the constant bias voltage ($V_\mathrm{bias}$) and the shunt resistor ($R_\mathrm{shunt}$) in parallel to the TES allows the TES working point to be set (negative electrothermal feedback). (b) A schematic illustration of the layer structure of the TES optimized for detecting 1064 nm photons, where the 20 nm thick superconducting W layer with a surface area of $25\,\mathrm{\mu m}\times25\,\mathrm{\mu m}$ acts as the active material. (c) Photograph of the TES+SQUID module used, which includes two TESs and their associated SQUIDs.
  • Figure 2: (a) The average measured light pulse and dark pulses, where the shaded regions represent the associated standard deviations. The inset presents a randomly chosen light pulse as an example of the signal and noise shapes. (b) Principal Component Analysis (PCA) scatter plot showing the projection of pulse feature vectors ($\tau_\mathrm{rise}$, $\tau_\mathrm{decay}$, $\chi_\mathrm{ph}^2$, $\mathrm{V}_\mathrm{min,\,FFT}$, $\chi^2_\mathrm{FFT}$) onto the first two principal components (PC$_1$ and PC$_2$). The inset shows a close-up of the cluster associated with light pulses, showing overlap with some of the dark pulses measured in extrinsic background.
  • Figure 3: A schematic illustration of the basic architecture of the considered CNN and its hyperparameters whose optimization is explicitly addressed (see the hyperparameter optimization summary table).
  • Figure 4: A schematic illustration of the division of the dataset into training and testing data. The training set was further divided 80%-20% into training and validation sets, where the validation set was used to evaluate the performance of the CNN during training.
  • Figure 5: The average $S$ scores as a function of the number of trainable parameters of the associated CNN. The error bars correspond to the standard deviations over 5 evaluations of the CNN with different training and testing sets. The dashed vertical line marks the limit $S=1$; points above it are colored red (4.1%) and points below it blue (95.9%).
  • ...and 5 more figures
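
The 80%-20% division of the training data into training and validation sets described in the Figure 4 caption can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `split_train_validation`, the random seed, and the placeholder arrays `traces` and `labels` (including their shapes) are assumptions made here. The preceding split into training and testing data is not shown because its ratio is not stated in this section.

```python
import numpy as np


def split_train_validation(n_samples, val_fraction=0.2, seed=0):
    """Return shuffled, disjoint index arrays (train_idx, val_idx).

    val_fraction=0.2 reproduces the 80%-20% division from the caption;
    the seed is an arbitrary choice for reproducibility.
    """
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    n_val = int(round(val_fraction * n_samples))
    return indices[n_val:], indices[:n_val]


# Hypothetical stand-in data: 1000 time traces of 256 samples each.
traces = np.zeros((1000, 256))
labels = np.arange(1000) % 2  # placeholder labels: 1 = light, 0 = dark

train_idx, val_idx = split_train_validation(len(traces))
x_train, y_train = traces[train_idx], labels[train_idx]
x_val, y_val = traces[val_idx], labels[val_idx]
```

Shuffling before splitting keeps light and dark pulses mixed in both subsets. A stratified split would additionally fix the class ratio in each subset, but whether the authors stratified is not stated here.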