Waging a Campaign: Results from an Injection-Recovery Study involving 35 numerical Relativity Simulations and three Waveform Models

Sarp Akçay, Charlie Hoy, Jake Mac Uilliam

TL;DR

This paper assesses the accuracy of three state-of-the-art precessing gravitational-wave waveform models (SEOBNRv5PHM, IMRPhenomTPHM, IMRPhenomXPHM) through an extensive injection-recovery campaign using 35 strongly precessing numerical relativity (NR) binary black hole (BBH) simulations, analyzed with a two-detector network at O4 design sensitivity. It quantifies biases via recovery scores and inspiral-merger-ringdown (IMR) consistency tests, revealing that SEOBNRv5PHM generally provides the most reliable recovery of the mass and mass ratio for $Q\le 4$, while IMRPhenomTPHM shows strong robustness in IMR consistency, and IMRPhenomXPHM exhibits model-dependent biases, especially at high mass ratios. The IMR consistency results depend on the chosen cutoff frequency, with Kerr ISCO cutoffs often reducing apparent GR deviations compared to Schwarzschild cutoffs; no single model consistently passes the IMR consistency test for $Q=8$ injections. The study also demonstrates that incorporating model accuracy into Bayesian inference (NR-informed model averaging) yields more accurate, less biased inferences than equal-weight or evidence-based model combinations, guiding future multi-model parameter-estimation strategies. Overall, the work informs waveform-model development, motivates multi-model analyses, and provides datasets (NR injections) to benchmark next-generation precessing waveform models.

Abstract

We present Bayesian inference results from an extensive injection-recovery campaign to test the validity of three state-of-the-art quasicircular gravitational waveform models: \textsc{SEOBNRv5PHM}, \textsc{IMRPhenomTPHM}, and \textsc{IMRPhenomXPHM}, the latter with the \textsc{SpinTaylorT4} implementation for its precession dynamics. We analyze 35 strongly precessing binary black hole numerical relativity simulations with all available harmonic content. Ten simulations have a mass ratio of $4:1$ and five have a mass ratio of $8:1$. Overall, we find that \textsc{SEOBNRv5PHM} is the model most consistent with numerical relativity, with the majority of true source properties lying within the inferred 90\% credible interval. However, we find that none of the models can reliably infer the true source properties for binaries with mass ratio $8:1$. We additionally conduct inspiral-merger-ringdown (IMR) consistency tests to determine whether our chosen state-of-the-art waveform models infer consistent properties when analyzing only the inspiral (low frequency) and ringdown (high frequency) portions of the signal. For the simulations considered in this work, we find that the outcome of the IMR consistency test depends on the frequency that separates the inspiral and ringdown regimes. For two sensible choices of the cutoff frequency, we report that \textsc{IMRPhenomXPHM} can produce false GR deviations. Meanwhile, we find that \textsc{IMRPhenomTPHM} is the most reliable model under the IMR consistency test. Finally, we re-analyze the same 35 simulations, but this time we incorporate model accuracy into our Bayesian inference. Consistent with the work in Hoy et al. 2024 [arXiv: 2409.19404], we find that this approach generally yields more accurate inferred properties for binary black holes, with fewer biases, than methods that combine model-dependent posterior distributions based on their evidence or with equal weight.

Paper Structure

This paper contains 16 sections, 27 equations, 12 figures, 7 tables.

Figures (12)

  • Figure 1: Left panel: The precession of $\mathbf{S}_1$ and $\mathbf{S}_2$ around the total angular momentum vector $\mathbf{J}$, which points out of the page through the origin. As the spin vectors precess, they trace out precession cones around $\mathbf{J}$ whose projections are plotted in the figure as the blue and orange curves. The small fluctuations in the trajectories are due to nutation. As the mass ratio of this system (SXS:BBH:1200) is $2:1$ and the dimensionless spins are equal, the magnitude of $\mathbf{S}_1$ is approximately $2^2=4$ times larger than the magnitude of $\mathbf{S}_2$. The dots mark the initial projections of the spin vectors. Right panel: The numerical relativity waveform strain $\mathrm{Re}(h_\text{NR}(t))$ for this simulation seen both in the $\mathbf{L}$ frame (top row) and the $\mathbf{J}$ frame (bottom row). In this figure, the time units have been geometrized (adimensionalized) via $G=c=M=1$. For $M=53.6M_\odot$, given for this system in Table \ref{tab:SXS_sim_params}, a time interval of $4000M$ converts to $1.06$ seconds. Though the waveforms in the two frames look very similar, they are not identical, as can be seen in the lower right subfigure, where we overlaid the orange $\mathbf{L}$-frame merger-ringdown waveform on top of its $\mathbf{J}$-frame counterpart (blue).
  • Figure 2: One-dimensional marginalized posterior distributions obtained for the inferred total detector-frame mass (first column), mass ratio (second column), effective parallel (third column) and effective perpendicular spins (fourth column) for the 30 SXS binary black hole simulations used throughout this work. The red vertical lines indicate the true values.
  • Figure 3: Same as Fig. \ref{fig:three_model_posteriors} but for the five single-spin $Q=8$ BAM binary black hole simulations of Sec. \ref{sec:Q8_injections}. The red vertical lines indicate the true values.
  • Figure 4: The posteriors for $\Delta M_f/\bar{M}_f$ (top row) and $\Delta a_f/\bar{a}_f$ (bottom row) from the IMR consistency test performed with XPHM for the 30 $Q\le 4$ numerical relativity simulations whose parameters are given in Tables \ref{tab:SXS_sim_params} and \ref{tab:SXS_extrinsic_params}. See Eqs. (\ref{eq:M_f_bar})-(\ref{eq:Delta_a_f}) for the relevant definitions. Each row is further grouped into three clusters of 10 subfigures each, separated by the mass ratio of the binary black hole simulations, with the $Q=1,2,4$ clusters from left to right. Note that the vertical scale is different in each cluster. In each plot, we show the IMR results obtained with two different choices for the cutoff frequency; see Eqs. (\ref{eq:Sch_ISCO_freq}) and (\ref{eq:Kerr_ISCO_freq}). In orange (cyan) we show the results obtained with a cutoff frequency equal to the Schwarzschild (Kerr) ISCO frequency.
  • Figure 5: Model performance under the IMR consistency test for the eight special cases mentioned in Sec. \ref{sec:IMR_test_Qle4} using the Schwarzschild ISCO as the frequency cutoff [Eq. (\ref{eq:Sch_ISCO_freq})].
  • ...and 7 more figures
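
The two pieces of arithmetic quoted in the Figure 1 caption (the conversion of $4000M$ to seconds for $M=53.6M_\odot$, and the factor of $2^2=4$ between the spin magnitudes) can be checked with a short script. The following is a minimal sketch, not code from the paper: the function names are hypothetical, the constants are standard SI values, and the spin ratio assumes $|\mathbf{S}_i| = \chi_i m_i^2$ with equal dimensionless spins $\chi_i$. The last helper gives the textbook dominant-mode gravitational-wave frequency at the Schwarzschild ISCO, $c^3/(6^{3/2}\pi G M)$; whether the paper's Schwarzschild cutoff equation uses the total mass or the remnant mass is not stated in this excerpt, so the mass argument is left to the caller.

```python
import math

# Standard SI constants. The product G*M_sun is known far more precisely
# than G or M_sun individually, so it is used directly.
GM_SUN = 1.32712440018e20  # m^3 s^-2, solar gravitational parameter
C = 299_792_458.0          # m s^-1, speed of light


def geometric_time_to_seconds(t_over_m, mass_msun):
    """Convert a time in units of M (with G = c = 1) to seconds.

    One unit of M corresponds to G*M/c^3 seconds.
    """
    seconds_per_m = GM_SUN * mass_msun / C**3
    return t_over_m * seconds_per_m


def spin_magnitude_ratio(q, chi1=1.0, chi2=1.0):
    """Return |S1|/|S2| for mass ratio q = m1/m2, assuming |S_i| = chi_i * m_i**2."""
    return (chi1 / chi2) * q**2


def schwarzschild_isco_gw_frequency(mass_msun):
    """Textbook dominant-mode GW frequency (Hz) at the Schwarzschild ISCO.

    Assumption: f = c^3 / (6^{3/2} * pi * G * M); the caller chooses which
    mass (total or remnant) to pass in.
    """
    return C**3 / (6**1.5 * math.pi * GM_SUN * mass_msun)


if __name__ == "__main__":
    interval = geometric_time_to_seconds(4000, 53.6)
    print(f"4000 M at 53.6 M_sun: {interval:.2f} s")
    print(f"|S1|/|S2| for 2:1 with equal chi: {spin_magnitude_ratio(2):.0f}")
```

With these constants, $4000M$ at $53.6M_\odot$ evaluates to about $1.06$ seconds and the $2:1$ equal-$\chi$ spin ratio to $4$, matching the values quoted in the caption.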