Lightweight posterior construction for gravitational-wave catalogs with the Kolmogorov-Arnold network

Wenshuai Liu, Yiming Dong, Ziming Wang, Lijing Shao

TL;DR

This paper tackles the data-storage and transmission burden of gravitational-wave posterior catalogs by introducing a Kolmogorov-Arnold network (KAN)-based neural density estimator. By replacing fixed activations with learnable edge-wise splines, KAN yields interpretable, efficient density models whose posteriors can be reconstructed from compact neural network weights or closed-form analytic expressions, drastically reducing storage needs. The authors demonstrate two data products (high-fidelity neural network weights and analytic PDF expressions) that enable rapid posterior resampling with minimal loss of fidelity, validated on verifiable benchmark distributions, real GW posteriors, and hierarchical population inference. This lightweight catalog framework promises scalable, real-time, and transmission-efficient GW analyses for next-generation detectors, while preserving the statistical accuracy required for population studies. Key mathematical quantities include the posterior $P(\boldsymbol{\theta}|d)$ and the evidence $Z(d)$, both of which are preserved in the surrogate representations.

Abstract

Neural density estimation has seen widespread application in gravitational-wave (GW) data analysis, where it enables real-time parameter estimation for compact binary coalescences and supports rapid inference in subsequent analyses such as population inference. In this work, we explore using the Kolmogorov-Arnold network (KAN) to construct efficient and interpretable neural density estimators for lightweight posterior construction of GW catalogs. By replacing conventional activation functions with learnable splines, KAN achieves superior interpretability, higher accuracy, and greater parameter efficiency on related scientific tasks. Leveraging this feature, we propose a KAN-based neural density estimator, which ingests megabyte-scale GW posterior samples and compresses them into model weights of tens of kilobytes. Analytic expressions requiring only several kilobytes can then be distilled from these neural network weights with minimal loss of accuracy. In practice, high-fidelity GW posterior samples can be regenerated rapidly from the model weights or analytic expressions for subsequent analysis. Our lightweight posterior construction strategy is expected to facilitate user-level data storage and transmission, paving the way for efficient analysis of the numerous GW events expected from next-generation GW detectors.

Paper Structure

This paper contains 12 sections, 13 equations, 12 figures, 5 tables.

Figures (12)

  • Figure 1: Flowchart for lightweight GW catalog construction. Starting from raw posterior samples of a single GW event as our training set, we first train the KAN-based neural density estimator to minimize the average negative log-posterior. As shown in the box at the bottom left, we replace MADE's unmasked edges with KAN's learnable nonlinear edges (a linear combination of B-spline curves). Once the loss has converged, the analytic expression of each unmasked edge is obtained and synthesized through the network topology to get analytic expressions of all output nodes: $\ln(a_{ij})$, $\mu_{ij}$, and $-2\ln(\sigma_{ij})$; see main text for details. Expressions of all output nodes are combined to obtain the analytic joint PDF. After training and symbolification, we store the neural network weights, typically of size $\mathcal{O}\,(10^{1} \, {\rm KB})$, and analytic expressions of output nodes, typically of size $\mathcal{O}\,(1 \, {\rm KB})$, as compact data products. Users can download these data products and use them to compute probability density or resample posterior samples for downstream analysis tasks.
  • Figure 2: Results of three verifiable cases. Marginalized one- and two-dimensional distributions are shown in corner plots, comparing samples resampled from fitted expressions (orange) and samples generated from ground-truth distributions (blue). Both of them contain 40,000 samples. In Case 2, $\phi$ is fixed at 0 during the sampling process. Contour lines in the two-dimensional joint distributions delineate the $50\%$ and $90\%$ credible regions. The second row displays the ground-truth PDFs of the three cases, where $\mathcal{N}(\mu,\sigma^2)$ denotes a Gaussian distribution with mean value $\mu$ and variance $\sigma^2$. The third row shows the analytic expressions of the fitted PDFs chosen by our KAN-based neural density estimator where the coefficients are rounded to two decimal places.
  • Figure 3: Results of kernel density estimation (KDE) for Case 3. The KDE is fitted to the 110 samples using a Gaussian kernel with a bandwidth of 0.35, as implemented in the scikit-learn package. Marginalized one- and two-dimensional distributions are shown in the corner plot, comparing samples resampled using KDE (red) and samples generated from the ground-truth distribution (blue). Both sets contain 40,000 samples. Contour lines in the two-dimensional joint distributions delineate the $50\%$ and $90\%$ credible regions.
  • Figure 4: Marginalized one- and two-dimensional distributions, comparing samples generated with the neural network weights (orange) and raw samples (blue) of GW150914. Both of them contain 147,634 samples. Contour lines in the two-dimensional joint distributions delineate the $50\%$ and $90\%$ credible regions.
  • Figure 5: Same as Fig. 4, but for samples generated with analytic expressions (orange), the fitted multivariate Gaussian distribution (grey), and raw samples (blue).
  • ...and 7 more figures
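
The Figure 1 caption lists the output nodes $\ln(a_{ij})$, $\mu_{ij}$, and $-2\ln(\sigma_{ij})$ and defers the details to the main text. The sketch below shows one way such outputs could be turned into a log-density, a training loss, and new samples. It assumes the outputs parameterize a Gaussian mixture for each autoregressive conditional, which is a common reading of a MADE-style estimator but is not confirmed by the caption. The function `toy_outputs` is a hypothetical stand-in for the trained network or its analytic expressions, and every name in the block is illustrative rather than taken from the authors' code.

```python
import math
import random


def toy_outputs(theta_prev):
    """Hypothetical stand-in for the trained network or analytic expressions.

    Takes the parameters already fixed by the autoregressive ordering and
    returns (ln_a, mu, neg2_ln_sigma) for the next parameter.
    """
    shift = theta_prev[-1] if theta_prev else 0.0
    ln_a = [math.log(0.3), math.log(0.7)]          # log mixture weights
    mu = [shift - 1.0, shift + 1.0]                # component means
    neg2_ln_sigma = [0.0, -2.0 * math.log(0.5)]    # sigma = 1 and sigma = 0.5
    return ln_a, mu, neg2_ln_sigma


def log_sum_exp(values):
    """Numerically stable ln(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_conditional(x, ln_a, mu, neg2_ln_sigma):
    """ln of sum_j a_j N(x; mu_j, sigma_j^2), with the weights normalized."""
    ln_norm = log_sum_exp(ln_a)
    terms = []
    for la, m, s in zip(ln_a, mu, neg2_ln_sigma):
        # With s = -2 ln(sigma): -ln(sigma) = s / 2 and 1 / sigma^2 = exp(s).
        ln_gauss = (-0.5 * math.log(2.0 * math.pi) + 0.5 * s
                    - 0.5 * (x - m) ** 2 * math.exp(s))
        terms.append(la - ln_norm + ln_gauss)
    return log_sum_exp(terms)


def log_pdf(theta, outputs=toy_outputs):
    """Joint log-density as a sum of autoregressive conditionals."""
    return sum(log_conditional(x, *outputs(theta[:i]))
               for i, x in enumerate(theta))


def mean_nll(samples, outputs=toy_outputs):
    """Average negative log-density over samples (the training loss)."""
    return -sum(log_pdf(t, outputs) for t in samples) / len(samples)


def resample(n_dim, rng, outputs=toy_outputs):
    """Draw one sample, one parameter at a time."""
    theta = []
    for _ in range(n_dim):
        ln_a, mu, s = outputs(theta)
        weights = [math.exp(la) for la in ln_a]
        j = rng.choices(range(len(mu)), weights=weights)[0]
        theta.append(rng.gauss(mu[j], math.exp(-0.5 * s[j])))
    return theta


rng = random.Random(0)
draws = [resample(2, rng) for _ in range(1000)]
print("average negative log-density of the draws:", mean_nll(draws))
```

In this reading, a user who downloads the compact data products would only need the equivalent of `outputs` (from the weights or from the analytic expressions) to evaluate the density or regenerate samples. The raw posterior samples would not be required.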