Neutral gas phase distribution from HI morphology: phase separation with scattering spectra and variational autoencoders

Minjie Lei, S. E. Clark, Rudy Morel, E. Allys, Iryna S. Butsky, Caleb Redshaw, Drummond B. Fielding

TL;DR

The paper addresses the problem of inferring HI phase structure (CNM vs. WNM) from emission data in the absence of absorption constraints. It introduces a data-driven framework that combines scattering spectra (SS) statistics with a Gaussian-mixture VAE to learn phase-specific morphologies, then decomposes HI emission channel maps into CNM, WNM, and noise using spatial morphology alone. The SS+VAE-derived CNM map, integrated over velocity channels, is highly correlated with an existing spectrum-based map while recovering more spatially coherent small-scale structure, and the method yields realistic, non-Gaussian phase realizations. The approach offers a new data-driven avenue for modeling Galactic HI phases and can be extended by incorporating spectral information, so that future decompositions in 3D PPV space use both spatial morphology and spectral line structure.

Abstract

Unraveling the multi-phase structure of the diffuse interstellar medium (ISM) as traced by neutral hydrogen (HI) is essential to understanding the lifecycle of the Milky Way. However, HI phase separation is a challenging and under-constrained problem. The neutral gas phase distribution is often inferred from the spectral line structure of HI emission. In this work, we develop a data-driven phase separation method that extracts HI phase structure solely from the spatial morphology of HI emission intensity structures. We combine scattering spectra (SS) statistics with a Gaussian-mixture variational autoencoder (VAE) model to: 1. derive an interpretable statistical model of different HI phases from their multi-scale morphological structures; 2. use this model to decompose the 2D channel maps of GALFA-HI emission in diffuse high latitude ($|b|>30^\circ$) regions over narrow velocity channels ($\Delta v=3$ km/s) into cold neutral medium (CNM), warm neutral medium (WNM), and noise components. We integrate our CNM map over velocity channels to compare it to an existing map produced by a spectrum-based method, and find that the two maps are highly correlated, while ours recovers more spatially coherent structures at small scales. Our work illustrates and quantifies a clear physical connection between the HI morphology and HI phase structure, and unlocks a new avenue for improving future phase separation techniques by making use of both HI spectral and spatial information to decompose HI in 3D position-position-velocity (PPV) space. These results are consistent with a physical picture where processes that drive HI phase transitions also shape the morphology of HI gas, imprinting a sparse, filamentary CNM that forms out of a diffuse, extended WNM.
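
The velocity integration mentioned in the abstract can be illustrated with a minimal sketch. It assumes the separated CNM component is available as a brightness-temperature cube in kelvin with shape `(n_channels, ny, nx)` and that the gas is optically thin, so the standard factor of $1.823\times10^{18}$ cm$^{-2}$ (K km/s)$^{-1}$ applies. The function name, array layout, and the optically thin assumption are illustrative choices, not details taken from the paper.

```python
import numpy as np

# Standard optically thin conversion from integrated brightness temperature
# (K km/s) to HI column density (cm^-2). Applying it here is an assumption.
N_HI_PER_K_KMS = 1.823e18


def integrate_channels(cube_k, dv_kms=3.0):
    """Collapse a (n_channels, ny, nx) cube in K into a column density map.

    dv_kms is the channel width; 3 km/s matches the channel width quoted
    in the abstract. The cube layout is a hypothetical convention.
    """
    cube_k = np.asarray(cube_k, dtype=float)
    if cube_k.ndim != 3:
        raise ValueError("expected a cube with shape (n_channels, ny, nx)")
    # Rectangle-rule integral over velocity: sum of T_b times channel width.
    return N_HI_PER_K_KMS * cube_k.sum(axis=0) * dv_kms


# Toy usage with random data standing in for a separated CNM cube.
rng = np.random.default_rng(0)
toy_cube = rng.random((5, 8, 8))
toy_column_density = integrate_channels(toy_cube)
```

The resulting 2D map is the kind of velocity-integrated quantity that can then be compared pixel by pixel with a map from a spectrum-based method.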

Paper Structure

This paper contains 26 sections, 12 equations, 12 figures.

Figures (12)

  • Figure 1: A flow chart describing the data-driven, morphology-based phase separation framework. The method is composed of a feature extraction step, where we compute the SS statistics of HI emission channel maps $I(x)$, reducing the input to a more compact but still interpretable representation; a feature clustering step, where a Gaussian-mixture VAE learns a component-by-component statistical model of HI phases from the SS statistics; and finally a component separation step, where VAE outputs are used as priors to inform phase separation via gradient descent synthesis.
  • Figure 2: Distribution of the SS summary statistics for the identified VAE clusters compared to that of the true phase distribution from the F23 simulation. The top left and right panels correspond to power and sparsity, while the bottom panel corresponds to filamentarity, as specified by Equations \ref{eq:power}-\ref{eq:linearity}. The three distinct components identified by the VAE model agree very well in the SS representation with the CNM, WNM+UNM, and noise, respectively.
  • Figure 3: Image realizations synthesized from the SS representations identified by the VAE model from the F23 simulation dataset, compared to the ground truth synthetic CNM, WNM+UNM, and noise patches. The left panels are the ground truth synthetic HI images, while the right panels correspond to the SS+VAE synthesis. The top, middle, and bottom rows correspond to CNM, WNM+UNM, and noise, respectively. All images are normalized to have zero mean and unit standard deviation.
  • Figure 4: Result of SS+VAE CNM separation performed on the F23 simulation dataset compared to the true CNM distribution for a few different velocity channels of a sample sightline. Left: original synthetic HI emission. Middle: true CNM column density. Right: CNM column density predicted by the SS+VAE model. The last panel shows the pixel-by-pixel correlation between the true and predicted CNM column density, with the white dashed line indicating one-to-one correlation.
  • Figure 5: Distribution of the SS summary statistics for the distinct clusters identified by the VAE model from the GALFA-HI dataset. The top left and right panels correspond to power and sparsity, while the bottom panel corresponds to filamentarity, as specified by Equations \ref{eq:power}-\ref{eq:linearity}. The morphological interpretations behind each coefficient show a consistent picture where cluster 0 describes a sparse, filamentary CNM, cluster 1 describes a diffuse WNM, and cluster 2 corresponds to the noise component. The qualitative scale-dependent behavior of each cluster is also consistent with the simulation results in Figure \ref{fig:prior_vs_true_plasmoid}.
  • ...and 7 more figures
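
Two operations named in the captions above, the zero-mean, unit-standard-deviation normalization of Figure 3 and the pixel-by-pixel correlation of Figure 4, can be sketched as follows. This is a minimal illustration: the function names are hypothetical, and the Pearson coefficient is an assumed summary of the pixel-wise comparison rather than a statistic the captions specify.

```python
import numpy as np


def standardize(image):
    """Rescale an image to zero mean and unit (population) standard deviation."""
    image = np.asarray(image, dtype=float)
    std = image.std()
    if std == 0.0:
        raise ValueError("cannot standardize a constant image")
    return (image - image.mean()) / std


def pixel_correlation(true_map, predicted_map):
    """Pearson correlation between two maps, compared pixel by pixel.

    For standardized arrays (population std), the Pearson coefficient is
    the mean of the element-wise product.
    """
    t = standardize(true_map).ravel()
    p = standardize(predicted_map).ravel()
    if t.shape != p.shape:
        raise ValueError("maps must contain the same number of pixels")
    return float(np.mean(t * p))


# Toy usage: a noisy copy of a random map standing in for a predicted CNM map.
rng = np.random.default_rng(1)
toy_true = rng.random((16, 16))
toy_pred = toy_true + 0.1 * rng.standard_normal((16, 16))
toy_r = pixel_correlation(toy_true, toy_pred)
```

A value of `toy_r` close to 1 corresponds to points clustering tightly around a straight line in a scatter plot of predicted against true values. Because both maps are standardized first, the coefficient measures linear agreement and is insensitive to an overall offset or scale factor between the maps.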