Domain Adaptive Graph Neural Networks for Constraining Cosmological Parameters Across Multiple Data Sets
Andrea Roncoli, Aleksandra Ćiprijanović, Maggie Voetberg, Francisco Villaescusa-Navarro, Brian Nord
TL;DR
This work tackles domain shift in cosmological parameter inference by applying Domain Adaptive Graph Neural Networks (DA-GNNs), trained with a Maximum Mean Discrepancy (MMD) loss, to CAMELS simulations from the IllustrisTNG and SIMBA suites. The DA-GNNs encode structured, scale-free galaxy data into a latent space aligned across domains, enabling more robust predictions of $\Omega_m$ with uncertainty estimates. In cross-domain tests, relative error improves by up to $28\%$ and $\chi^2$ by almost an order of magnitude, and latent-space visualizations show the two domains becoming aligned. This approach is a step toward robust cosmological inference on real survey data, since it mitigates simulation-to-observation biases without requiring labeled observational data.
Abstract
Deep learning models have been shown to outperform methods that rely on summary statistics, like the power spectrum, in extracting information from complex cosmological data sets. However, due to differences in the subgrid physics implementation and numerical approximations across different simulation suites, models trained on data from one cosmological simulation show a drop in performance when tested on another. Similarly, models trained on any of the simulations would also likely experience a drop in performance when applied to observational data. Training on data from two different suites of the CAMELS hydrodynamic cosmological simulations, we examine the generalization capabilities of Domain Adaptive Graph Neural Networks (DA-GNNs). By utilizing GNNs, we capitalize on their capacity to capture structured scale-free cosmological information from galaxy distributions. Moreover, by including unsupervised domain adaptation via Maximum Mean Discrepancy (MMD), we enable our models to extract domain-invariant features. We demonstrate that DA-GNN achieves higher accuracy and robustness on cross-dataset tasks (up to $28\%$ better relative error and up to almost an order of magnitude better $\chi^2$). Using data visualizations, we show the effects of domain adaptation on proper latent space data alignment. This shows that DA-GNNs are a promising method for extracting domain-independent cosmological information, a vital step toward robust deep learning for real cosmic survey data.
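To make the MMD term mentioned above concrete, the following is a minimal sketch of a (biased) squared-MMD estimate between two batches of latent vectors. It is an illustration under assumptions, not the authors' implementation: the Gaussian kernel, the `bandwidth` value, the function names `gaussian_kernel` and `mmd2`, and the synthetic 8-dimensional "latent" samples are all hypothetical choices that this abstract does not specify.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=2.0):
    """Gaussian (RBF) kernel matrix between rows of x and rows of y.

    The kernel family and bandwidth are assumptions made for this sketch.
    """
    # Pairwise squared Euclidean distances, shape (len(x), len(y)).
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd2(source, target, bandwidth=2.0):
    """Biased estimate of the squared MMD between two sample sets."""
    k_ss = gaussian_kernel(source, source, bandwidth).mean()
    k_tt = gaussian_kernel(target, target, bandwidth).mean()
    k_st = gaussian_kernel(source, target, bandwidth).mean()
    return k_ss + k_tt - 2.0 * k_st

# Synthetic stand-ins for latent vectors from two simulation suites.
rng = np.random.default_rng(0)
latent_a = rng.normal(0.0, 1.0, size=(200, 8))        # "source" domain
latent_b = rng.normal(0.0, 1.0, size=(200, 8))        # same distribution
latent_shifted = rng.normal(1.5, 1.0, size=(200, 8))  # shifted domain

mmd_same = mmd2(latent_a, latent_b)
mmd_shifted = mmd2(latent_a, latent_shifted)
print(f"MMD^2 (same distribution):    {mmd_same:.4f}")
print(f"MMD^2 (shifted distribution): {mmd_shifted:.4f}")
```

In a domain-adaptive setup of this kind, a term like `mmd2` would typically be added to the supervised regression loss with a weighting factor, so that minimizing the total loss pulls the latent distributions of the two domains together. Because the term compares only latent features and never uses target-domain labels, the adaptation is unsupervised.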
