The Challenge in Illuminating the Invisible: Constraining LyC Escape with Bayesian Modelling and Symbolic Regression

Amanda Stoffers, Sandro Tacchella, Charlotte Simmonds, Benjamin D. Johnson, Roberto Maiolino

TL;DR

The paper addresses the challenge of indirectly constraining LyC escape during the Epoch of Reionization by applying Bayesian SED fitting with Prospector to local LyC-leaking analogues (LzLCS). It systematically tests multiple prior and dust-attenuation configurations, demonstrates robust recovery of $f_{\rm esc}^{\rm LyC}$ in most cases, and derives a median escape fraction of 0.3%, with some systems reaching 70%, revealing that extreme LyC leakage does not always coincide with extreme global stellar properties. A symbolic regression analysis calibrated on synthetic Prospector data yields a concise relation, $\log_{10}(f_{\rm esc}^{\rm LyC}) = (-3.59\beta - 9.45) \pm 0.30$, that reproduces $f_{\rm esc}^{\rm LyC}$ within uncertainties for the LzLCS subset, offering a practical estimator when full SED fitting is impractical. The study highlights both the potential of Bayesian SED modelling to constrain LyC leakage and the limitations imposed by nebular emission modelling and ISM geometry, while framing local analogues as a valuable bridge to understanding reionization-era galaxies. Overall, the work advances LyC diagnostics by combining rigorous inference with data-driven regression, informing how indirect tracers can be calibrated and applied in high-redshift contexts.

Abstract

Direct observations of Lyman continuum (LyC) radiation from galaxies during the Epoch of Reionization (EoR) are impeded by absorption in the intergalactic medium, requiring indirect methods to infer the escape fraction of ionizing photons ($f_{\rm esc}^{\rm LyC}$). One approach is to develop and validate such methods on local analogues of the high-redshift galaxies with directly detected LyC leakage. In this work, we constrain $f_{\rm esc}^{\rm LyC}$ using a Bayesian spectral energy distribution (SED) fitting framework built on Prospector, which incorporates a non-parametric star-formation history, a flexible dust attenuation curve, self-consistent nebular emission, and fiber aperture-loss corrections. Our methodology jointly fits broadband photometry and emission line fluxes. We apply six models to the Low-redshift LyC Survey (LzLCS), a sample of local galaxies with properties comparable to EoR galaxies, and evaluate them based on their ability to recover the observed LyC flux and their relative Bayesian evidence. The best-performing model is further assessed through a parameter recovery test, demonstrating that $f_{\rm esc}^{\rm LyC}$ can be recovered within uncertainties. Building on these results, we present updated $f_{\rm esc}^{\rm LyC}$ estimates for the LzLCS sample, with a median of 0.3%, corresponding to very low leakage, and values reaching as high as 70%, with six of 64 galaxies having a cosmologically relevant $f_{\rm esc}^{\rm LyC}$ ($>5\%$). Additionally, we present a revised UV $\beta$-slope vs $\log_{10}(f_{\rm esc}^{\rm LyC})$ relation, derived using symbolic regression with PySR trained on a synthetic dataset generated from our best-performing model: $\log_{10}(f_{\rm esc}^{\rm LyC}) = (-3.59\beta - 9.45) \, \pm \, 0.30$. The relation successfully reproduces the $f_{\rm esc}^{\rm LyC}$ obtained from full SED fitting of the LzLCS sample within uncertainties.
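
As an illustration of how the $\beta$-slope relation quoted above might be applied, the following minimal Python sketch evaluates it for a given UV slope. The coefficients and the $\pm 0.30$ term come from the abstract. The function names, the reading of $\pm 0.30$ as a symmetric interval in $\log_{10}(f_{\rm esc}^{\rm LyC})$, and the clipping of the result to at most 1 are assumptions made here for illustration, not part of the paper.

```python
import math

# Coefficients quoted in the abstract:
#   log10(f_esc) = (-3.59 * beta - 9.45) +/- 0.30
SLOPE = -3.59
INTERCEPT = -9.45
SCATTER_DEX = 0.30  # assumption: treated as a symmetric interval in log10 space


def log10_fesc(beta: float) -> float:
    """Central value of log10(f_esc) for a UV continuum slope beta."""
    return SLOPE * beta + INTERCEPT


def fesc_interval(beta: float) -> tuple[float, float, float]:
    """Return (low, central, high) escape fractions for a given beta.

    Values are clipped to 1 because an escape fraction cannot exceed
    unity; the clipping is an assumption of this sketch.
    """
    centre = log10_fesc(beta)
    low = 10.0 ** (centre - SCATTER_DEX)
    mid = 10.0 ** centre
    high = 10.0 ** (centre + SCATTER_DEX)
    return tuple(min(1.0, value) for value in (low, mid, high))


if __name__ == "__main__":
    for beta in (-1.5, -2.0, -2.5):
        low, mid, high = fesc_interval(beta)
        print(f"beta = {beta:+.1f}: f_esc = {mid:.2e} (range {low:.2e} to {high:.2e})")
```

Because the relation is linear in $\beta$ with a steep negative slope, bluer slopes (more negative $\beta$) map to much larger escape fractions, so the estimate is very sensitive to the uncertainty on the measured $\beta$.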

Paper Structure

This paper contains 25 sections, 4 equations, 17 figures, 5 tables.

Figures (17)

  • Figure 1: Schematic illustration of the three dust components in our models. Red and blue stars represent stellar populations older and younger than $10$ Myr, respectively. Red dashed lines indicate non-ionizing radiation, while blue lines indicate ionizing radiation. Light from the older stellar population is attenuated by dust$_3$ (optical depth $\tau_3$). For the young population, a fraction of the ionizing photons is absorbed by dust$_1$, while the fraction escaping this birth-cloud dust and nebular absorption is described by $f_\mathrm{run}^\mathrm{OB}$. All stellar light then passes through the diffuse dust screen dust$_2$. The fraction of ionizing photons that survives all dust attenuation corresponds to $f_\mathrm{esc}^\mathrm{LyC}$.
  • Figure 2: Comparison between the observed Lyman continuum fluxes from Flury et al. (2022) and the fluxes recovered with Prospector for the d2log20 model. The upper panel shows $F_\mathrm{LyC}^\mathrm{Prospector}$ as a function of the measured $F_\mathrm{LyC}$, with orange circles indicating detections and blue triangles representing upper limits. Recovered fluxes were measured in the same variable wavelength ranges as in Flury et al. (2022). Error bars indicate the 16th–84th percentile credible intervals, and the dashed line marks the one-to-one relation. The lower panel shows the pull, $[\log(F_\mathrm{LyC}^\mathrm{pred}) - \log(F_\mathrm{LyC}^\mathrm{obs})]/\sigma_{\log}$, for the detected sources. While 25% of all galaxies deviate by more than $3\sigma$, only 20% of the galaxies with a detected LyC flux (rather than an upper limit) are outliers, demonstrating good agreement between model predictions and $F_\mathrm{LyC}^\mathrm{obs}$. The white star with black outline marks J164849+495751, for which we recover $F_\mathrm{LyC}$ within $1\sigma$ and infer a significant $f_\mathrm{esc}^\mathrm{LyC}$.
  • Figure 3: Comparison between the $f_{\mathrm{esc, UV}}^\mathrm{LyC}$ from Flury et al. (2022), derived from template fits to the UV, and the LyC escape fraction ($f_{\mathrm{esc}}^\mathrm{LyC}$) derived with Prospector. The $f_\mathrm{esc}^\mathrm{LyC}$ presented here is derived based on the restricted wavelength window discussed in Section \ref{sec:data}. Each point represents a galaxy for which the best-fitting model was selected using a hybrid evidence--$\chi^2$ criterion. The dashed black line marks the 1:1 relation. For galaxies with the highest $f_\mathrm{esc}^\mathrm{LyC}$ in our analysis, previous estimates consistently report very low escape fractions. When focusing only on cases with $f_\mathrm{esc}^\mathrm{LyC} > 10^{-2}$, an almost anti-correlated trend emerges. The white star with black outline marks J164849+495751, for which we recover $F_\mathrm{LyC}$ within $1\sigma$ and infer a significant $f_\mathrm{esc}^\mathrm{LyC}$.
  • Figure 4: Posterior distributions (blue) for the best-fit model of galaxy J164849+495751, which has a high $f_\mathrm{esc}^\mathrm{LyC} \sim 11\%$. The diagonal panels show marginalized one-dimensional distributions, with vertical lines indicating the median and the 16th and 84th percentiles. Off-diagonal panels display the two-dimensional parameter correlations. We present directly fitted parameters (optical depth of the diffuse dust component, $f_\mathrm{run}^\mathrm{OB}$, gas-phase metallicity, and $f_\mathrm{scale}$), as well as the derived quantity $f_\mathrm{esc}^\mathrm{LyC}$, alongside their priors (orange). The comparison between posteriors and priors demonstrates that the fitted parameters are informed by the data and not solely driven by the assumed priors. As expected, we find a strong degeneracy between $f_\mathrm{run}^\mathrm{OB}$ and $f_\mathrm{esc}^\mathrm{LyC}$, since in this nearly dust-free system they effectively trace the same physical quantity.
  • Figure 5: (a) Stacked star formation histories of the LzLCS galaxies. Each individual SFH, normalized to its maximum, is shown as a thin blue line. The thick orange line marks the population median, with the shaded region denoting the 16th–84th percentile range. The steadily rising median toward recent times, together with the narrow percentile spread, demonstrates that nearly all galaxies in the sample are experiencing a strong, recent ($<10$ Myr) burst of star formation. (b) Distributions of the half-mass lookback time ($t_{50}$; when 50% of the stellar mass was formed) and the time of peak star formation ($t_{\mathrm{max}}$). Most galaxies have $t_{50}$ within the last $\sim$2 Gyr, while their peak activity lies within the past 10 Myr. This confirms that the LzLCS sample represents an intensely star-forming population dominated by very recent bursts.
  • ...and 12 more figures
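
The pull statistic defined in the Figure 2 caption, $[\log(F_\mathrm{LyC}^\mathrm{pred}) - \log(F_\mathrm{LyC}^\mathrm{obs})]/\sigma_{\log}$, and the fraction of sources deviating by more than $3\sigma$ could be computed along the following lines. This is a minimal sketch: the function names are hypothetical, the logarithm is assumed to be base 10, and $\sigma_{\log}$ is taken as a supplied input because the caption does not specify how it is constructed. The example fluxes are made-up placeholders, not values from the paper.

```python
import math


def log_pull(f_pred: float, f_obs: float, sigma_log: float) -> float:
    """Pull in log-flux space: (log F_pred - log F_obs) / sigma_log.

    Assumes base-10 logarithms and that sigma_log is the uncertainty
    on the log-flux difference (both are assumptions of this sketch).
    """
    if f_pred <= 0 or f_obs <= 0:
        raise ValueError("fluxes must be positive to take a logarithm")
    if sigma_log <= 0:
        raise ValueError("sigma_log must be positive")
    return (math.log10(f_pred) - math.log10(f_obs)) / sigma_log


def outlier_fraction(pulls: list[float], threshold: float = 3.0) -> float:
    """Fraction of sources whose absolute pull exceeds the threshold."""
    if not pulls:
        raise ValueError("need at least one pull value")
    return sum(abs(p) > threshold for p in pulls) / len(pulls)


if __name__ == "__main__":
    # Placeholder (F_pred, F_obs, sigma_log) triples in arbitrary flux units.
    sample = [(1.2, 1.0, 0.10), (0.5, 1.0, 0.05), (3.0, 2.5, 0.20)]
    pulls = [log_pull(*row) for row in sample]
    print("pulls:", [round(p, 2) for p in pulls])
    print("fraction beyond 3 sigma:", outlier_fraction(pulls))
```

Upper limits are not handled here, since a pull is only defined for sources with a measured flux; this matches the caption, which shows the pull for detected sources only.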