Sweeping the Dust Away -- Correcting the Phase Space Density of the Milky Way with Unsupervised Machine Learning
Eric Putney, David Shih, Sung Hak Lim, Matthew R. Buckley
TL;DR
Dust extinction biases measurements of the Milky Way's gravitational potential made with the Boltzmann equation. This work addresses that bias by jointly inferring the dust-corrected phase-space density (PSD) and the potential. It introduces a data-driven framework in which neural networks learn a dust efficiency factor $\epsilon(\vec{x})$ and a gravitational potential $\Phi(\vec{x})$ from the kinematics of Red Clump (RC) and Red Giant Branch (RGB) stars in Gaia DR3, with the collisionless Boltzmann equation enforced as the training objective. Masked Autoregressive Flows model the observed PSD, and the networks for $\epsilon$ and $\Phi$ are regularized to remain physically plausible. The learned $\epsilon$ maps align with a state-of-the-art 3D dust map and yield a dust-corrected PSD $f_{\rm corr}$ that reveals a more coherent disk structure and supports dynamical inference in the disk volume; uncertainties are quantified with ensembles of trainings. Overall, the approach offers a data-driven, physically constrained way to disentangle dust effects from stellar dynamics, toward robust measurements of the Galactic potential and dark matter distribution. A companion paper presents the corresponding acceleration and mass-density inferences.
Abstract
The Boltzmann equation relates the equilibrium phase space distribution of stars in the Milky Way to the Galaxy's gravitational potential. However, observations of stellar populations are biased by extinction from foreground dust, which complicates measurements of the potential in the disk and towards the Galactic center. We apply machine learning to the kinematics of Red Clump and Red Giant Branch stars in Gaia DR3 to simultaneously estimate the unbiased stellar phase space density and the gravitational potential. The unbiased phase space density is obtained through a learned "dust efficiency factor", an observational selection function that accounts for dust extinction. The potential and the dust efficiency are parameterized by fully connected neural networks and are entirely data-driven. In this work we validate the dust efficiency against a recent three-dimensional dust map; we examine the potential in a companion paper.
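As a sketch of how these pieces fit together, the relations below use the symbols from the TL;DR ($\epsilon$, $\Phi$, $f_{\rm corr}$) plus $f_{\rm obs}$ for the observed phase space density, a label assumed here for illustration. They assume a multiplicative efficiency that depends on position only and the standard time-independent collisionless Boltzmann equation; the paper's actual training objective and conventions may differ in detail.

$$
f_{\rm obs}(\vec{x},\vec{v}) = \epsilon(\vec{x})\, f_{\rm corr}(\vec{x},\vec{v}),
\qquad
\vec{v}\cdot\nabla_{\vec{x}} f_{\rm corr} - \nabla_{\vec{x}}\Phi\cdot\nabla_{\vec{v}} f_{\rm corr} = 0 .
$$

Dividing the second relation by $f_{\rm corr}$ and using $\ln f_{\rm corr} = \ln f_{\rm obs} - \ln\epsilon$, where $\epsilon$ has no velocity dependence, gives

$$
\vec{v}\cdot\nabla_{\vec{x}}\ln f_{\rm obs} - \vec{v}\cdot\nabla_{\vec{x}}\ln\epsilon - \nabla_{\vec{x}}\Phi\cdot\nabla_{\vec{v}}\ln f_{\rm obs} = 0 .
$$

Under these assumptions, once the observed density is modeled, the only remaining unknowns are $\nabla_{\vec{x}}\ln\epsilon$ and $\nabla_{\vec{x}}\Phi$, which is why the dust efficiency and the potential can be constrained jointly by the same equation.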
