Using Galaxy Evolution as Source of Physics-Based Ground Truth for Generative Models
Yun Qi Li, Tuan Do, Evan Jones, Bernie Boscoe, Kevin Alfaro, Zooey Nguyen
TL;DR
This work treats galaxy evolution as physics-based ground truth for evaluating generative image models. It develops two generative architectures conditioned on redshift $z$, a denoising diffusion probabilistic model (DDPM) and a conditional variational autoencoder (CVAE), and introduces physics-inspired metrics (galaxy KL loss, galaxy-fitting loss, redshift loss) alongside the standard Inception Score (IS) and Fréchet Inception Distance (FID) to quantify realism. On a Hyper Suprime-Cam galaxy dataset spanning a range of redshifts, the DDPM generally outperforms the CVAE on the physics-based metrics, especially at higher redshifts, though neither model reliably recovers the conditioned redshift or fully captures low-redshift diversity. The study demonstrates that physics-grounded evaluation can reveal strengths and limitations of generative models beyond those visible to human perceptual judgment, guiding future improvements in physics-aware image generation.
Abstract
Generative models that produce images have enormous potential to advance discoveries across scientific fields, but they require metrics capable of quantifying the quality of their high-dimensional output. We propose that astrophysics data, such as galaxy images, can test generative models against physics-motivated ground truths in addition to human judgment. For example, galaxies in the Universe form and change over billions of years, following physical laws and relationships that are both easy to characterize and difficult to encode in generative models. We build a conditional denoising diffusion probabilistic model (DDPM) and a conditional variational autoencoder (CVAE) and test their ability to generate realistic galaxies conditioned on their redshifts (a proxy for galaxy age). This is one of the first studies to probe these generative models using physically motivated metrics. We find that both models produce comparably realistic galaxies based on human evaluation, but our physics-based metrics are better able to discern the strengths and weaknesses of the generative models. Overall, the DDPM performs better than the CVAE on the majority of the physics-based metrics. Ultimately, if we can show that generative models can learn the physics of galaxy evolution, they have the potential to unlock new astrophysical discoveries.
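
To make the idea of a physics-based comparison concrete, the sketch below estimates a KL divergence between the distributions of one scalar galaxy property measured on real and on generated images. It is only an illustration: the abstract does not define the metrics, so the function name `histogram_kl`, the shared-bin histogram estimator, the smoothing constant `eps`, and the synthetic "property" values are all assumptions rather than the authors' procedure.

```python
import numpy as np


def histogram_kl(real_values, generated_values, bins=20, eps=1e-10):
    """Estimate KL(P_real || P_generated) for one scalar property.

    Hypothetical helper: both samples are binned on shared edges, then
    the discrete KL divergence between the two histograms is returned.
    """
    real_values = np.asarray(real_values, dtype=float)
    generated_values = np.asarray(generated_values, dtype=float)

    # Shared bin edges so both histograms cover the same support.
    lo = min(real_values.min(), generated_values.min())
    hi = max(real_values.max(), generated_values.max())
    edges = np.linspace(lo, hi, bins + 1)

    p, _ = np.histogram(real_values, bins=edges)
    q, _ = np.histogram(generated_values, bins=edges)

    # Smooth with eps to avoid log(0) in empty bins, then normalise.
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))


# Synthetic stand-ins for a measured property (assumed, not real data).
rng = np.random.default_rng(0)
real = rng.normal(loc=0.0, scale=1.0, size=2000)
close_model = rng.normal(loc=0.1, scale=1.0, size=2000)
poor_model = rng.normal(loc=2.0, scale=1.0, size=2000)

print("close model:", histogram_kl(real, close_model))
print("poor model: ", histogram_kl(real, poor_model))
```

A smaller value indicates that the generated sample reproduces the real distribution of that property more closely. Under the same assumptions, the comparison could be repeated within separate redshift bins to see where a model departs from the data.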
