Bayesian Posteriors with Stellar Population Synthesis on GPUs

Georgios Zacharegkas, Andrew Hearin, Andrew Benson

TL;DR

The paper tackles the computational bottleneck of deriving Bayesian SPS posteriors for large galaxy surveys. It combines three strategies that exploit parallel architectures: an approximate photometry calculation that reduces the cost of each model evaluation, GPU-accelerated forward modeling via DSPS, and gradient-based inference with HMC/NUTS. It reports a ~50x speedup in the photometry calculation and a throughput of ~1000 posteriors per minute on a single GPU, which makes inference on large galaxy samples practical. It also discusses limitations at the scale of LSST datasets and highlights the potential of AI-based emulators and of population-level or amortized inference. The associated code is publicly available.

Abstract

Models of Stellar Population Synthesis (SPS) provide a predictive framework for the spectral energy distribution (SED) of a galaxy. SPS predictions can be computationally intensive, creating a bottleneck for attempts to infer the physical properties of large populations of individual galaxies from their SEDs and photometry; these computational challenges are especially daunting for near-future cosmology surveys that will measure the photometry of billions of galaxies. In this paper, we explore a range of computational techniques aimed at accelerating SPS predictions of galaxy photometry using the JAX library to target GPUs. We study a particularly advantageous approximation to the calculation of galaxy photometry that speeds up the computation by a factor of 50 relative to the exact calculation. We introduce a novel technique for incorporating burstiness into models of galaxy star formation history that captures very short-timescale fluctuations with negligible increase in computation time. We study the performance of Hamiltonian Monte Carlo (HMC) algorithms in which individual chains are parallelized across independent GPU threads, finding that our pipeline can carry out Bayesian inference at a rate of approximately $1000$ galaxy posteriors per minute on a single GPU. Our results provide an update to standard benchmarks in the literature on the computational demands of SPS inference; our publicly available code enables previously impractical Bayesian analyses of large galaxy samples, and includes several standalone modules that could be adopted to speed up existing SPS pipelines.
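
The idea of running one HMC chain per galaxy in parallel can be illustrated with a minimal JAX sketch. This is not the authors' pipeline: the Gaussian `log_prob` is a toy stand-in for an SPS posterior, and the function names, step size, number of leapfrog steps, and chain length are all assumptions chosen for illustration. The sketch only shows how `jax.vmap` batches independent chains so that they can execute in parallel on an accelerator.

```python
import jax
import jax.numpy as jnp


def log_prob(theta, data):
    # Toy stand-in for an SPS log-posterior (assumption): unit-variance Gaussian around `data`.
    return -0.5 * jnp.sum((theta - data) ** 2)


def hmc_step(theta, key, data, step_size=0.1, n_leapfrog=10):
    # One HMC transition: sample momentum, integrate with leapfrog, then accept or reject.
    key_mom, key_acc = jax.random.split(key)
    p0 = jax.random.normal(key_mom, theta.shape)
    grad = jax.grad(log_prob)

    def leapfrog(_, state):
        q, p = state
        p = p + 0.5 * step_size * grad(q, data)  # half step in momentum
        q = q + step_size * p                    # full step in position
        p = p + 0.5 * step_size * grad(q, data)  # half step in momentum
        return q, p

    q, p = jax.lax.fori_loop(0, n_leapfrog, leapfrog, (theta, p0))
    h0 = -log_prob(theta, data) + 0.5 * jnp.sum(p0 ** 2)
    h1 = -log_prob(q, data) + 0.5 * jnp.sum(p ** 2)
    accept = jnp.log(jax.random.uniform(key_acc)) < h0 - h1
    return jnp.where(accept, q, theta)


def run_chain(key, data, n_steps=500):
    # Run a single chain for one galaxy, starting from the origin.
    def body(theta, k):
        theta = hmc_step(theta, k, data)
        return theta, theta

    keys = jax.random.split(key, n_steps)
    _, samples = jax.lax.scan(body, jnp.zeros_like(data), keys)
    return samples


# Batch over galaxies: each row of `targets` gets its own independent chain.
run_many = jax.jit(jax.vmap(run_chain))

n_gal, n_par = 8, 3  # hypothetical sizes
targets = jnp.linspace(-1.0, 1.0, n_gal * n_par).reshape(n_gal, n_par)
keys = jax.random.split(jax.random.PRNGKey(0), n_gal)
samples = run_many(keys, targets)                # shape (n_gal, n_steps, n_par)
posterior_means = samples[:, 100:].mean(axis=1)  # discard the first 100 steps
```

Because every chain runs the same sequence of operations on different data, the batched computation maps naturally onto GPU hardware; increasing `n_gal` adds more chains to the same compiled program.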

Paper Structure

This paper contains 15 sections, 21 equations, 9 figures.

Figures (9)

  • Figure 1: Comparison between running an HMC inference using the approximate and exact photometry model, as described in Section \ref{sec_Photometry}. In this plot we show 3 of the 12 model parameters as a demonstration of how close the two runs look. The full 12-parameter plot is shown in Figure \ref{appfig:ApproxModelTestAllParams} in Appendix \ref{appendix_ApproxPhotoModel}.
  • Figure 1: Stellar age PDF for a generated galaxy without burstiness ($F_{\rm burst}=0$, dashed line), compared to models with burstiness (solid lines), for a range of $F_{\rm burst}$ parameter values, color-coded as shown in the colorbar.
  • Figure 1: Synthetic Gaussian transmission bands used to produce synthetic data and during the model fits.
  • Figure 1: Comparison between HMC, NUTS and MCMC, showing all 12 model parameters.
  • Figure 2: Runtime comparisons between running on a single GPU versus on a CPU as a function of the number of fits we perform in parallel. We have run our HMC and NUTS pipelines for $3000$ steps, with $100$ steps of adaptation on both machines, and under exactly the same configurations, fitting our full 12-parameter approximate photometry model to the same target data.
  • ...and 4 more figures
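
The synthetic Gaussian transmission bands mentioned in the figure list can be sketched in JAX as follows. This is a simplified illustration, not the paper's photometry code: the band centers, the common width, the wavelength grid, and the helper names are hypothetical, and the band flux is computed as a plain transmission-weighted mean of the SED rather than a full magnitude calculation.

```python
import jax
import jax.numpy as jnp


def gaussian_band(wave, center, width):
    # Gaussian transmission curve with peak value 1 at `center`.
    return jnp.exp(-0.5 * ((wave - center) / width) ** 2)


def trapezoid(y, x):
    # Trapezoidal rule written out explicitly to avoid depending on a JAX version.
    return jnp.sum(0.5 * (y[1:] + y[:-1]) * jnp.diff(x))


def band_flux(wave, sed, trans):
    # Transmission-weighted mean flux through one band (simplified definition).
    return trapezoid(sed * trans, wave) / trapezoid(trans, wave)


# Hypothetical wavelength grid (Angstrom), band centers, and width.
wave = jnp.linspace(3000.0, 10000.0, 2000)
centers = jnp.array([4000.0, 5500.0, 7000.0, 8500.0])
width = 400.0

# One transmission curve per band: shape (n_bands, n_wave).
bands = jax.vmap(gaussian_band, in_axes=(None, 0, None))(wave, centers, width)

# A flat SED should give the same flux in every band.
flat_sed = jnp.full_like(wave, 2.0)
fluxes = jax.vmap(band_flux, in_axes=(None, None, 0))(wave, flat_sed, bands)
```

Mapping over bands with `jax.vmap` keeps the calculation differentiable and batched, which is the property that gradient-based samplers such as HMC rely on.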