Modeling X-ray photon pile-up with a normalizing flow

Ole König, Daniela Huppenkothen, Douglas Finkbeiner, Christian Kirsch, Jörn Wilms, Justina R. Yang, James F. Steiner, Juan Rafael Martínez-Galarza

TL;DR

The paper addresses the challenge of pile-up in X-ray CCD detectors, which biases spectral inference or forces the affected data to be discarded. It proposes a simulation-based inference pipeline that uses a CNN to encode four annulus spectra and a neural spline normalizing flow to output posterior distributions for flux, temperature, and absorption from piled-up data. Using 40,000 SIXTE-simulated piled-up eROSITA observations based on an absorbed blackbody, the method yields tighter, well-calibrated posteriors than traditional PSF-core-excision MCMC, with a mean absolute percentage error below 10%. This approach enables exploitation of archival eROSITA data for population analyses and bright-source studies and can be extended to a broader library of spectral models.
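
For readers unfamiliar with the flow component, the density it represents can be written with the standard change-of-variables formula. The notation is assumed here for illustration and is not taken from the paper: $\theta$ stands for the three source parameters, $x$ for the four annulus spectra, $h_\phi$ for the CNN encoder, $f_\phi$ for the invertible spline transform, and $p_z$ for a simple base density such as a standard normal.

```latex
\begin{align}
q_\phi(\theta \mid x) &= p_z\bigl(f_\phi(\theta;\, h_\phi(x))\bigr)\,
  \left|\det \frac{\partial f_\phi(\theta;\, h_\phi(x))}{\partial \theta}\right|, \\
\mathcal{L}(\phi) &= -\frac{1}{N}\sum_{i=1}^{N} \log q_\phi(\theta_i \mid x_i).
\end{align}
```

The second line is the negative log-likelihood commonly minimized over $N$ simulated pairs $(\theta_i, x_i)$ in neural posterior estimation; whether the paper uses exactly this objective is an assumption.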

Abstract

The dynamic range of imaging detectors flown on board X-ray observatories often only covers a limited flux range of extrasolar X-ray sources. The analysis of bright X-ray sources is complicated by so-called pile-up, which results from high incident photon flux. This nonlinear effect distorts the measured spectrum, resulting in biases in the inferred physical parameters, and can even lead to a complete signal loss in extreme cases. Piled-up data are commonly discarded due to the resulting intractability of the likelihood. As a result, a large number of archival observations remain underexplored. We present a machine learning solution to this problem, using a simulation-based inference framework that allows us to estimate posterior distributions of physical source parameters from piled-up eROSITA data. We show that a normalizing flow produces better-constrained posterior densities than traditional mitigation techniques, as more data can be leveraged. We consider model- and calibration-dependent uncertainties and the applicability of such an algorithm to real data in the eROSITA archive.
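
The nonlinearity described above can be illustrated with a deliberately simplified toy model. The sketch below is not SIXTE and not the paper's simulation: it assumes a single pixel, monoenergetic photons, Poisson arrivals, and that all photons arriving within one read-out frame merge into a single event carrying their summed energy. The function name `simulate_pixel` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_pixel(rate_per_frame, n_frames, rng, photon_energy_kev=1.0):
    """Toy single-pixel detector (assumption: photons in one frame merge)."""
    # Number of photons hitting the pixel in each read-out frame
    n_photons = rng.poisson(rate_per_frame, size=n_frames)
    # A frame yields one event if at least one photon arrived
    occupied = n_photons > 0
    # Energy pile-up: the event energy is the sum of all photon energies
    event_energies = n_photons[occupied] * photon_energy_kev
    return int(n_photons.sum()), event_energies

rng = np.random.default_rng(42)
results = {}
for rate in (0.01, 1.0):  # mean photons per frame: faint vs. bright source
    n_true, energies = simulate_pixel(rate, 200_000, rng)
    # Events above the single-photon energy contain two or more photons
    piled_fraction = float(np.mean(energies > 1.0))
    results[rate] = (n_true, energies.size, piled_fraction)
    print(f"rate={rate}: photons={n_true}, events={energies.size}, "
          f"piled-up fraction={piled_fraction:.3f}")
```

In this toy, the faint case records almost every photon as its own event, while the bright case records fewer events than photons and shifts a large share of them to higher energies, which is the qualitative signature of pile-up.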

Paper Structure

This paper contains 11 sections, 6 figures, 1 table.

Figures (6)

  • Figure 1: Illustration of X-ray photon pile-up in CCD detectors (Schmid2012PhDT). Left: A double event can be created either from one photon that lands near the border of the pixel (low photon flux) or two photons impacting both pixels (pattern pile-up). In the latter case, the event contains signal from the sum of the individual photon impacts (energy pile-up). Right: Likewise, the same recorded single event can be created from one or two photons impacting the same pixel.
  • Figure 2: Example of a simulated piled-up eROSITA observation in the training dataset, representing an absorbed blackbody source with a temperature of 130 eV and an absorption of $1.3\times 10^{22}\,\mathrm{cm}^{-2}$. Spectra of four annuli are used as input to the neural network. At a flux of $6.9\times 10^{-9}\,\mathrm{erg}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$, considerable pattern pile-up is present, clearly seen as a depression in brightness at the center of the image (blue arrow), as well as energy pile-up, which manifests as an artificial high-energy bump (blue and orange spectra around channel 200). The outer annuli are much less affected by pile-up, as the photons are distributed over more pixels in the lower-intensity outer wing of the PSF.
  • Figure 3: Posterior distributions for two examples from the test dataset. a) At a flux of $1.2\times 10^{-10}\,\mathrm{erg}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$, significant pile-up is present, and only data after excising the PSF core can be used in a traditional MCMC analysis. The red distribution shows the MCMC posterior from a spectrum extracted from an outer annulus with radii 120"--240" (351 counts). Black lines depict the NF posteriors, created by sampling 10 000 points from the distribution. The blue square denotes the ground truth. The flow posterior is more constraining than the MCMC posterior because all data from the source region can be leveraged. b) In the non-piled-up regime, at $2\times 10^{-12}\,\mathrm{erg}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$, the MCMC (green) can take data from the whole source region into account (233 counts) and serves as a baseline distribution. The NF posterior is similarly sized and shaped. Contour lines denote 0.5, 1, 1.5, and $2\sigma$ confidence intervals (for a 2D Gaussian distribution).
  • Figure 4: Coverage plots for the three parameters of the normalizing flow. Thick lines indicate the coverage across the full test dataset (6000 sets of spectra). Thin lines denote specific ranges in the parameter space. For a perfectly calibrated NF, the lines should converge to the bisector (gray dashed line). Regions above/below the bisector indicate under-/overconfidence, respectively.
  • Figure B.1: Direct parameter reconstruction with the CNN without the normalizing flow. In this test case, we add one final fully-connected output layer of dimension three. Thus, the network directly predicts the three physical parameters. The outputs for all examples in the test dataset are shown. The red dashed line shows the rough threshold where eROSITA data become piled up (Merloni2024a), although we caution that the effect is gradual.
  • ...and 1 more figure
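
The coverage test summarized in the Figure 4 caption can be sketched on a toy problem. The example below is a minimal illustration under stated assumptions, not the paper's procedure: it uses central credible intervals of a one-dimensional marginal (the paper's exact interval definition is not given here), a conjugate Gaussian model whose exact posterior stands in for a well-calibrated flow, and an artificially narrowed posterior as the overconfident case. The function name `empirical_coverage` is hypothetical.

```python
import numpy as np

def empirical_coverage(samples, truths, levels):
    """Fraction of test cases whose central credible interval contains the truth.

    samples: (n_obs, n_draws) posterior draws, truths: (n_obs,), levels in (0, 1).
    """
    # Rank of each true value within its own posterior draws, in [0, 1]
    ranks = (samples < truths[:, None]).mean(axis=1)
    # The central interval of credibility c contains the truth
    # exactly when 2 * |rank - 0.5| <= c
    distance = 2.0 * np.abs(ranks - 0.5)
    return np.array([(distance <= c).mean() for c in levels])

rng = np.random.default_rng(0)
n_obs, n_draws = 2000, 1000
theta = rng.normal(0.0, 1.0, n_obs)        # prior N(0, 1)
x = theta + rng.normal(0.0, 1.0, n_obs)    # noisy observation, unit noise
levels = np.linspace(0.1, 0.9, 9)

# Exact posterior for this toy model: N(x / 2, sqrt(1/2))
exact = rng.normal(x[:, None] / 2.0, np.sqrt(0.5), (n_obs, n_draws))
cov_exact = empirical_coverage(exact, theta, levels)

# Overconfident stand-in: same mean, half the posterior width
narrow = rng.normal(x[:, None] / 2.0, 0.5 * np.sqrt(0.5), (n_obs, n_draws))
cov_narrow = empirical_coverage(narrow, theta, levels)

for c, a, b in zip(levels, cov_exact, cov_narrow):
    print(f"nominal={c:.1f}  exact={a:.3f}  narrowed={b:.3f}")
```

Plotting empirical coverage against the nominal level puts the exact posterior close to the diagonal and the narrowed posterior below it, matching the caption's reading that curves below the bisector indicate overconfidence.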