A point cloud approach to generative modeling for galaxy surveys at the field level

Carolina Cuesta-Lazaro, Siddharth Mishra-Sharma

TL;DR

The paper addresses the challenge of modeling the full 3-D distribution of galaxies as a point cloud and inferring cosmological parameters via a conditional likelihood $p(x|\theta)$. It introduces a diffusion-based generative model with backbones based on graph neural networks or transformers to learn the forward noise process $q(z_t|x)$ and the reverse denoiser $p_{\varphi}(z_{t-1}|z_t)$ (or score $\nabla_{z_t}\log p(z_t)$), enabling both emulation and likelihood-based inference. The method is demonstrated on Quijote halo catalogs, where the model reproduces key summary statistics and captures cosmology-dependent behavior, while avoiding information loss from voxelization. This point-cloud diffusion approach offers a path toward comprehensive, field-level cosmological analyses that extend beyond traditional summary statistics and image-based methods.

Abstract

We introduce a diffusion-based generative model to describe the distribution of galaxies in our Universe directly as a collection of points in 3-D space (coordinates) optionally with associated attributes (e.g., velocities and masses), without resorting to binning or voxelization. The custom diffusion model can be used both for emulation, reproducing essential summary statistics of the galaxy distribution, and for inference, by computing the conditional likelihood of a galaxy field. We demonstrate a first application to massive dark matter haloes in the Quijote simulation suite. This approach can be extended to enable a comprehensive analysis of cosmological data, circumventing limitations inherent to summary-statistic-based as well as neural simulation-based inference methods.
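
To make the inference use case concrete, the following is a minimal sketch of how a conditional likelihood $p(x|\theta)$ could be turned into a posterior over a parameter grid. Everything here is an illustrative assumption rather than the authors' pipeline: `toy_log_likelihood` is a hypothetical analytic stand-in (an isotropic Gaussian whose width plays the role of $\theta$), whereas a diffusion model would typically supply a variational estimate of the log-likelihood. The flat prior and the grid are also assumptions.

```python
import math
import random


def toy_log_likelihood(x, theta):
    """Hypothetical stand-in for log p(x | theta).

    Treats every coordinate of every point as an independent draw from
    N(0, theta^2). A trained generative model would replace this function.
    """
    n = sum(len(p) for p in x)                 # total number of coordinates
    sq = sum(c * c for p in x for c in p)      # sum of squared coordinates
    return -0.5 * sq / theta**2 - n * math.log(theta) - 0.5 * n * math.log(2 * math.pi)


def grid_posterior(x, thetas, log_likelihood):
    """Posterior weights on a parameter grid, assuming a flat prior."""
    logp = [log_likelihood(x, th) for th in thetas]
    m = max(logp)                              # subtract the max for numerical stability
    w = [math.exp(lp - m) for lp in logp]
    total = sum(w)
    return [wi / total for wi in w]


# Toy "observed" point cloud: 500 points in 3-D drawn with true_theta = 1.0.
rng = random.Random(0)
true_theta = 1.0
x = [tuple(rng.gauss(0.0, true_theta) for _ in range(3)) for _ in range(500)]

thetas = [0.5 + 0.01 * i for i in range(101)]  # grid from 0.5 to 1.5
posterior = grid_posterior(x, thetas, toy_log_likelihood)
theta_map = thetas[max(range(len(thetas)), key=posterior.__getitem__)]
```

The same `grid_posterior` helper would work with any callable that returns a log-likelihood estimate for a point cloud, so only the stand-in needs replacing. For more than one or two parameters a sampler would be the more practical choice than a grid.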

Paper Structure (4 sections, 1 figure)

Figures (1)

  • Figure 1: A schematic overview of the point cloud diffusion model, showing samples from the diffusion process at different diffusion times. During training, noise is added to a data sample $x$ using the diffusion kernel $q(z_t\mid x)$ and a denoising distribution $p_\varphi\left(z_{t-1} \mid z_t\right)$ is learned. To generate samples, we simulate the reverse process: we sample noise from a standard Gaussian distribution and denoise it iteratively using the learned denoising distribution. Source notebook: https://github.com/smsharma/point-cloud-galaxy-diffusion/blob/arXiv-v1/notebooks/04_viz_pos.ipynb
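
The forward and reverse processes described in the caption can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation. It assumes a variance-preserving cosine schedule with $\alpha_t^2 + \sigma_t^2 = 1$ and a Gaussian kernel $q(z_t \mid x) = \mathcal{N}(\alpha_t x, \sigma_t^2 I)$. The argument `predict_clean` is a hypothetical placeholder for the learned network (a graph neural network or transformer in the paper). It is replaced here by an oracle that already knows $x$, so the example runs without any training.

```python
import math
import random


def alpha_sigma(t):
    """Assumed variance-preserving schedule on t in [0, 1]."""
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)


def forward_noise(x, t, rng):
    """Sample z_t ~ q(z_t | x) = N(alpha_t * x, sigma_t^2 * I), point by point."""
    alpha, sigma = alpha_sigma(t)
    return [tuple(alpha * c + sigma * rng.gauss(0.0, 1.0) for c in p) for p in x]


def reverse_step(z_t, x_hat, t, s, rng):
    """Sample z_s ~ q(z_s | z_t, x = x_hat) for s < t (Gaussian posterior)."""
    alpha_t, sigma_t = alpha_sigma(t)
    alpha_s, sigma_s = alpha_sigma(s)
    alpha_ts = alpha_t / alpha_s                        # alpha_{t|s}
    var_ts = max(sigma_t**2 - alpha_ts**2 * sigma_s**2, 0.0)  # sigma_{t|s}^2
    w_z = alpha_ts * sigma_s**2 / sigma_t**2            # weight on the noisy sample
    w_x = alpha_s * var_ts / sigma_t**2                 # weight on the clean estimate
    std = math.sqrt(var_ts) * sigma_s / sigma_t         # posterior standard deviation
    return [
        tuple(w_z * zc + w_x * xc + std * rng.gauss(0.0, 1.0) for zc, xc in zip(zp, xp))
        for zp, xp in zip(z_t, x_hat)
    ]


def sample(predict_clean, n_points, n_steps, rng):
    """Start from standard Gaussian noise and denoise iteratively from t=1 to t=0."""
    z = [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in range(n_points)]
    for i in range(n_steps, 0, -1):
        t, s = i / n_steps, (i - 1) / n_steps
        z = reverse_step(z, predict_clean(z, t), t, s, rng)
    return z


# Toy data: 100 points in the unit cube. The oracle stands in for a trained model.
rng = random.Random(0)
x = [tuple(rng.random() for _ in range(3)) for _ in range(100)]
noisy = forward_noise(x, 0.5, rng)                      # a partially noised cloud
samples = sample(lambda z, t: x, n_points=len(x), n_steps=50, rng=rng)
```

With the oracle in place the reverse chain ends exactly on `x`, which is a useful sanity check on the posterior weights. A trained denoiser would produce new point clouds instead. This sketch also omits the conditioning on cosmological parameters $\theta$ and any per-point attributes such as velocities or masses.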