Neural Network Prediction of Strong Lensing Systems with Domain Adaptation and Uncertainty Quantification
Shrihan Agarwal, Aleksandra Ćiprijanović, Brian D. Nord
TL;DR
The paper addresses predicting the strong-lensing Einstein radius $\theta_E$ with reliable uncertainty estimates when a model trained on simulated (source) data is applied to a target domain that mimics real observations. It combines mean-variance estimators (MVEs) with unsupervised domain adaptation (UDA) through a total loss $\mathcal{L}_{\mathrm{Tot}} = \mathcal{L}_{\beta\text{-}\mathrm{NLL}}(\beta_{\mathrm{NLL}}) + \alpha_{\mathrm{UDA}}\,\mathcal{L}_{\mathrm{UDA}}$. The $\beta$-NLL term is trained on labeled source data only, while the UDA term uses the Maximum Mean Discrepancy (MMD) to align the latent representations of source and target data without target labels. The key finding is that UDA roughly halves the mean residual on the target domain, from $0.0818$ arcsec to $0.0425$ arcsec, and yields better-calibrated uncertainties; the source and target embeddings also align under MVE-UDA. This is a step toward applying MVEs to real observational data from surveys such as DES and future projects, which would make neural-network-based lens modeling more practical for cosmology.
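The total loss above can be sketched in a few lines of NumPy. This is an illustrative forward pass only, not the authors' implementation: the function names, the Gaussian (RBF) kernel, its bandwidth, and the default values of `beta` and `alpha_uda` are all assumptions chosen for the example, and the $\beta$-NLL form follows the standard definition in which the per-sample Gaussian negative log-likelihood is weighted by $\sigma^{2\beta}$.

```python
import numpy as np


def beta_nll(mu, var, y, beta=0.5):
    """Beta-NLL over a batch: Gaussian NLL weighted per sample by var**beta.

    In an autodiff framework the weight var**beta is treated as a constant
    (stop-gradient); in this NumPy forward pass that makes no difference.
    beta=0.5 is an assumed default, not a value taken from the summary.
    """
    nll = 0.5 * (np.log(var) + (y - mu) ** 2 / var)  # constant term dropped
    weight = var ** beta
    return float(np.mean(weight * nll))


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth ** 2))


def mmd2(z_source, z_target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sets of embeddings."""
    k_ss = rbf_kernel(z_source, z_source, bandwidth).mean()
    k_tt = rbf_kernel(z_target, z_target, bandwidth).mean()
    k_st = rbf_kernel(z_source, z_target, bandwidth).mean()
    return float(k_ss + k_tt - 2.0 * k_st)


def total_loss(mu_s, var_s, y_s, z_s, z_t, beta=0.5, alpha_uda=1.0):
    """L_Tot = L_beta-NLL (labeled source only) + alpha_UDA * L_UDA (MMD)."""
    return beta_nll(mu_s, var_s, y_s, beta) + alpha_uda * mmd2(z_s, z_t)
```

Note that only the source batch contributes labels (`y_s`); the target batch enters solely through its embeddings `z_t`, which is what makes the adaptation unsupervised.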
Abstract
Modeling strong gravitational lenses is computationally expensive for the complex data from modern and next-generation cosmic surveys. Deep learning has emerged as a promising approach for finding lenses and predicting lensing parameters, such as the Einstein radius. Mean-variance estimators (MVEs) are a common approach for obtaining aleatoric (data) uncertainties from a neural network prediction. However, neural networks have not been demonstrated to perform well on out-of-domain target data, for example when trained on simulated data and applied to real observational data. In this work, we perform the first study of the efficacy of MVEs in combination with unsupervised domain adaptation (UDA) on strong lensing data. The source-domain data is noiseless, and the target-domain data has noise mimicking modern cosmology surveys. We find that adding UDA to an MVE increases the accuracy on the target data by a factor of about two over an MVE model without UDA. Including UDA also yields substantially better-calibrated aleatoric uncertainty predictions. Advancements in this approach may enable future applications of MVE models to real observational data.
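As a quick consistency check, and assuming the accuracy gain is measured by the target-domain mean residuals quoted in the TL;DR, the stated factor of about two corresponds to

$$\frac{0.0818\ \mathrm{arcsec}}{0.0425\ \mathrm{arcsec}} \approx 1.92 .$$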
