Hybrid summary statistics: neural weak lensing inference beyond the power spectrum

T. Lucas Makinen, Alan Heavens, Natalia Porqueres, Tom Charnock, Axel Lapel, Benjamin D. Wandelt

TL;DR

This work addresses the information loss inherent in relying solely on two-point weak lensing statistics by introducing a hybrid approach that augments physics-based summaries with neural summaries optimized to add information beyond the power spectrum. Building on Information Maximising Neural Networks (IMNNs), the method learns neural compressions that complement an existing statistic, here the tomographic angular power spectrum, using an information-update Fisher framework. The authors implement a lightweight, physically-informed neural network with multipole kernel embeddings to produce additional summaries, and validate the approach with tomographic convergence maps across multiple resolutions and noise levels. In both low- and high-noise regimes, the hybrid statistics extract roughly 3× to 8× as much Fisher information as the angular power spectrum and yield tighter cosmological constraints, particularly on $\boldsymbol{\theta}=(\Omega_m, S_8)$. The framework is simulation-based, scalable, and broadly applicable to other datasets, offering a path toward more efficient, interpretable implicit inference for large-scale structure surveys.

Abstract

In inference problems, we often have domain knowledge which allows us to define summary statistics that capture most of the information content in a dataset. In this paper, we present a hybrid approach, where such physics-based summaries are augmented by a set of compressed neural summary statistics that are optimised to extract the extra information that is not captured by the predefined summaries. The resulting statistics are very powerful inputs to simulation-based or implicit inference of model parameters. We apply this generalisation of Information Maximising Neural Networks (IMNNs) to parameter constraints from tomographic weak gravitational lensing convergence maps to find summary statistics that are explicitly optimised to complement angular power spectrum estimates. We study several dark matter simulation resolutions in low- and high-noise regimes. We show that i) the information-update formalism extracts at least $3\times$ and up to $8\times$ as much information as the angular power spectrum in all noise regimes, ii) the network summaries are highly complementary to existing 2-point summaries, and iii) our formalism allows for networks with smaller, physically-informed architectures to match much larger regression networks with far fewer simulations needed to obtain asymptotically optimal inference.

Paper Structure (22 sections, 25 equations, 11 figures, 1 table)

This paper contains 22 sections, 25 equations, 11 figures, 1 table.

Figures (11)

  • Figure 1: Hybrid summary network schematic, illustrated for weak gravitational lensing. Noisy data (weak lensing $\boldsymbol{\kappa}_b$ with shape noise) are passed in parallel to an existing summary function (tomographic $C_\ell$ with optional MOPED compression) to produce summaries $\textbf{t}$, and a network (CNN) to output an additional set of summaries $\textbf{x}$, described in Section \ref{sec:wl_stats} and illustrated in Figure \ref{fig:neuralnet}. To train the network, the Fisher information is first calculated for $\textbf{t}$ and then updated via Equation \ref{eq:matrixupdate} to yield $\textbf{F}$, for the loss Eq. \ref{eq:imnn-loss}.
  • Figure 2: We use a small convolutional neural network that exploits the data symmetries to compress $\kappa$ fields down to additional summaries. Input data (here of shape $(128,128,4)$) are passed through a residual multipole kernel layer (shared colour indicates shared weights) and then to convolutional blocks with varying strides and small $2\times2$ kernels to capture fluctuations on different scales. All linear layers are followed by a nonlinear activation function. Dashed arrows indicate feature concatenation at the same spatial resolution. Downsampling continues until the spatial resolution of the data reaches $8\times 8$, after which the output tensor is mean-pooled along the spatial axes and passed to three dense layers. The output from the network is a pair of numbers.
  • Figure 3: Cartoon of density estimation scheme with fixed compression (network or MOPED). Parameters $\boldsymbol{\theta}$ are drawn from a prior and MLE estimates $\hat{\boldsymbol{\theta}}$ are produced from data $\textbf{d}$ for either (fixed) compression method using Eq. \ref{eq:mle-from-fisher} and fed to a MAF neural density estimator for the amortised posterior distribution, trained under the loss in Eq. \ref{eq:nde_loss}.
  • Figure 4: Using information-update network summaries (green) drastically improves $\Omega_m - \sigma_8$ constraints beyond MOPED $C_\ell$ summaries in a low-noise setting. We compare the posteriors obtained from a KiDS-like survey truncation at $\ell_{\rm cut}=1500$ (blue) to the constraints from all available modes $\ell_{\rm cut}=6400$ at the given resolution (green). The network's additional summaries (dark green) improve information extraction by a factor of 5 beyond the $\ell_{\rm cut}=6400$ constraints and a factor of 8 above the $\ell_{\rm cut} = 1500$ constraints.
  • Figure 5: Information-update network (bottom) makes simulations more distinguishable in summary space than $C_\ell$ compression (top). Points in parameter-summary space are coloured by the opposite parameter's value. The network finds patterns that separate these summaries in a complementary fashion even away from the fiducial point $(\Omega_m, S_8 )= (0.3, 0.8)$. We display a 3D view of this four-dimensional space in Figure \ref{fig:3dsummaries}.
  • ...and 6 more figures
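
The Fisher bookkeeping behind Figures 1 and 3 can be illustrated with a toy example. The sketch below is not the paper's code, and it does not reproduce the paper's update equation or its network. It assumes Gaussian-distributed summaries with a parameter-independent covariance, estimates a Fisher matrix from simulations for a set of existing summaries $\textbf{t}$ alone and for the concatenation of $\textbf{t}$ with extra summaries $\textbf{x}$, and then forms a linearised quasi-maximum-likelihood estimate around the fiducial point. The simulator `toy_summaries`, the helper names, the noise levels, and the step sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def fisher_ingredients(s_fid, s_plus, s_minus, delta):
    """Gaussian-summary Fisher estimate from simulations (illustrative assumption).

    s_fid:   (n_sims, n_summaries) summaries at the fiducial parameters
    s_plus:  (n_params, n_sims, n_summaries) summaries at theta_fid + delta_i
    s_minus: (n_params, n_sims, n_summaries) summaries at theta_fid - delta_i
    delta:   (n_params,) finite-difference step sizes
    """
    mu = s_fid.mean(axis=0)
    cov = np.atleast_2d(np.cov(s_fid, rowvar=False))
    # Central finite differences of the mean summaries, shape (n_params, n_summaries).
    dmu = (s_plus.mean(axis=1) - s_minus.mean(axis=1)) / (2.0 * delta[:, None])
    fisher = dmu @ np.linalg.solve(cov, dmu.T)
    return mu, cov, dmu, fisher

def quasi_mle(s, theta_fid, mu, cov, dmu, fisher):
    """Linearised estimate: theta_fid + F^{-1} dmu C^{-1} (s - mu), one row per input."""
    score = dmu @ np.linalg.solve(cov, (s - mu).T)       # (n_params, n)
    return theta_fid + np.linalg.solve(fisher, score).T  # (n, n_params)

def toy_summaries(theta, n_sims, seed):
    """Hypothetical simulator: columns 0-1 play the role of t, column 2 of x."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=(n_sims, 3)) * np.array([1.0, 2.0, 0.5])
    signal = np.array([theta[0], theta[0] + theta[1], theta[1]])
    return signal + noise

theta_fid = np.array([0.3, 0.8])
delta = np.array([0.01, 0.01])
n_sims = 2000
steps = np.eye(2) * delta  # row i perturbs parameter i only

s_fid = toy_summaries(theta_fid, n_sims, seed=0)
# The same seed is reused for the +/- simulations so noise cancels in the derivative.
s_plus = np.stack([toy_summaries(theta_fid + steps[i], n_sims, seed=1) for i in range(2)])
s_minus = np.stack([toy_summaries(theta_fid - steps[i], n_sims, seed=1) for i in range(2)])

# Fisher information of the existing summaries t alone (first two columns).
_, _, _, F_t = fisher_ingredients(s_fid[:, :2], s_plus[:, :, :2], s_minus[:, :, :2], delta)
# Fisher information of the joint vector (t, x).
mu, cov, dmu, F_joint = fisher_ingredients(s_fid, s_plus, s_minus, delta)

gain = np.linalg.det(F_joint) / np.linalg.det(F_t)
print("det F_joint / det F_t =", gain)

# Compress fresh "observations" to parameter estimates with the joint summaries.
theta_true = np.array([0.35, 0.75])
s_obs = toy_summaries(theta_true, n_sims, seed=2)
theta_hat = quasi_mle(s_obs, theta_fid, mu, cov, dmu, F_joint)
print("mean quasi-MLE:", theta_hat.mean(axis=0))
```

In this toy the extra summary responds mainly to the second parameter, so adding it raises the determinant of the Fisher matrix. For a positive-definite covariance estimate, appending summaries can never lower the Gaussian Fisher information, which is what makes the ratio a convenient measure of what the additional summaries contribute.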