FrankenStat I: a New Approach to Pulsar Timing Array Data Combination

David Wright, Kalista Wayt, Jeffrey S. Hazboun, Xavier Siemens, Rutger van Haasteren, Levi Schult, Stephen R. Taylor

TL;DR

The paper introduces FrankenStat, a data-combination framework for pulsar timing array (PTA) analyses that concatenates residuals from multiple datasets without fitting a single merged timing model. By building a concatenated Fourier design matrix and block-diagonal timing-model (TM) and noise structures, FrankenStat yields FrankenPulsars that support PTA analyses with nearly the same sensitivity as traditionally combined datasets while cutting the time needed for data combination to minutes. Across one hundred simulations, FrankenStat recovers maximum a posteriori (MAP) gravitational wave background (GWB) parameters in close agreement with fully merged analyses and yields near-identical signal-to-noise ratio (SNR) and p-value distributions, which suggests substantial practical benefits for IPTA-scale data processing. The approach, implemented in accessible software (GitHub and MetaPulsar), promises faster, scalable PTA science and can be extended to incorporate cross-dataset correlations via covariance methods.
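
The concatenation described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names (`fourier_design_matrix`, `block_diag`, `frankenize`), the dictionary layout, the choice of a shared frequency grid set by the total time span, and the purely diagonal white-noise matrix are all assumptions made for the example.

```python
import numpy as np


def fourier_design_matrix(toas, n_freqs, t_span):
    """Sine/cosine Fourier basis at frequencies k / t_span, k = 1..n_freqs."""
    freqs = np.arange(1, n_freqs + 1) / t_span
    phase = 2.0 * np.pi * np.outer(toas, freqs)
    F = np.empty((toas.size, 2 * n_freqs))
    F[:, 0::2] = np.sin(phase)
    F[:, 1::2] = np.cos(phase)
    return F


def block_diag(blocks):
    """Place 2-D blocks along the diagonal of an otherwise zero matrix."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out


def frankenize(datasets, n_freqs=5):
    """Combine per-dataset arrays for one pulsar (hypothetical layout).

    Each dataset is a dict with 'toas', 'residuals', 'M' (TM design
    matrix) and 'sigma' (TOA uncertainties). No merged TM is fit.
    """
    toas = np.concatenate([d["toas"] for d in datasets])
    residuals = np.concatenate([d["residuals"] for d in datasets])
    # Assumption: one common frequency grid from the combined time span,
    # so the Fourier basis is shared by all datasets of this pulsar.
    t_span = toas.max() - toas.min()
    F = fourier_design_matrix(toas, n_freqs, t_span)
    # Each dataset keeps its own timing model: block-diagonal M.
    M = block_diag([d["M"] for d in datasets])
    # Assumption: white noise only, hence a diagonal noise matrix.
    N = np.diag(np.concatenate([d["sigma"] for d in datasets]) ** 2)
    return {"toas": toas, "residuals": residuals, "F": F, "M": M, "N": N}


# Toy data in arbitrary units: three datasets of ten TOAs each.
rng = np.random.default_rng(0)
datasets = []
for _ in range(3):
    t = np.sort(rng.uniform(0.0, 10.0, size=10))
    datasets.append({
        "toas": t,
        "residuals": rng.normal(0.0, 1e-7, size=t.size),
        "M": np.vander(t, 3, increasing=True),  # toy quadratic TM
        "sigma": np.full(t.size, 1e-7),
    })
franken = frankenize(datasets, n_freqs=5)
```

Because the frequencies are shared, evaluating the basis on the concatenated TOAs is the same as stacking the per-dataset Fourier matrices vertically, while the timing-model columns of different datasets never mix.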

Abstract

In 2023, after more than two decades of searching, pulsar timing array (PTA) collaborations around the world announced evidence for a stochastic gravitational wave background. This announcement was quickly followed by work from the International Pulsar Timing Array (IPTA) demonstrating that the results of the regional collaborations were consistent with each other. The combination of these datasets is still ongoing and represents a significant investment of time and expertise. In that IPTA comparison, authors of this letter combined the separate datasets incoherently in the standard PTA optimal detection statistic for cross-correlations. That is, the data were combined without fitting a merged timing model across all PTA datasets, treating datasets of the same pulsar as independent and neglecting the "same pulsar, different datasets" cross-correlations. This work refines that method by extending its core ideas beyond detection statistics and into a full, general data-combination method. We have demonstrated its efficacy and extreme efficiency on simulated data. This new method, FrankenStat, is very similar in sensitivity and parameter-constraining power to traditional data combination methods while completing the full data combination in just a few minutes.

Paper Structure

This paper contains 9 sections, 1 theorem, 22 equations, 4 figures, 1 table.

Key Result

Lemma 1

The trace of a projection matrix times a positive semidefinite matrix is greater than or equal to zero.
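
Assuming "projection matrix" here means an orthogonal projection $P$ (symmetric, with $P^2 = P$), one standard argument, not necessarily the authors' proof, is $\mathrm{tr}(PA) = \mathrm{tr}(P^2 A) = \mathrm{tr}(PAP) \ge 0$, because $PAP = P^{T} A P$ is positive semidefinite whenever $A$ is. The short numerical check below illustrates this on random matrices; the sizes and the random seed are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3

# Orthogonal projection onto the column space of a random n x k matrix.
X = rng.standard_normal((n, k))
Q, _ = np.linalg.qr(X)   # reduced QR: Q has orthonormal columns
P = Q @ Q.T              # symmetric and idempotent

# Random positive semidefinite matrix A = B B^T.
B = rng.standard_normal((n, n))
A = B @ B.T

trace_PA = np.trace(P @ A)
trace_PAP = np.trace(P @ A @ P)  # equal to trace_PA by cyclicity and P^2 = P
```

The symmetry assumption matters: for an oblique (non-symmetric) idempotent matrix the trace can be negative, so the check applies only to orthogonal projections.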

Figures (4)

  • Figure 1: Flow chart detailing pipeline. First, a baseline, fully "Combined" PTA is simulated. Then, a single pulsar noise analysis (SPNA) is run on each pulsar using gradient descent to find maximum a posteriori (MAP) noise parameters, and the timing model (TM) is refit. To construct the split PTAs that we will later recombine with FrankenStat, we take every third TOA from the original data to create three new PTA datasets. We then run an SPNA and TM refit on every pulsar in these new, split datasets. Once the TMs are refit, we can "Frankenize" the pulsars to create FrankenPulsars. The FrankenPulsars should give us noise estimates that are more accurate than the individual PTAs and comparable to the Combined pulsars. To benefit from this increased sensitivity, we run SPNAs on the FrankenStat pulsars as well. In order to calculate detection statistics, we also need estimates of the GWB. We find a MAP estimate using a CURN likelihood for the Combined PTA, each split PTA, and the FrankenStat PTA. These MAP parameters are then used to find the SNR and corresponding $p$-value from the null distribution.
  • Figure 2: Difference between Combined and FrankenStat CURN maximum likelihood estimates for one hundred simulations. The estimates are in close agreement, with $\mu \pm \sigma$ of $-0.03 \pm 0.19$ and $0.05 \pm 0.28$ for $\log_{10}(A)$ and $\gamma$, respectively.
  • Figure 3: Distribution of SNRs and $p$-values corresponding to the maximum likelihood parameters in Figure 2. Dotted lines represent 1D kernel density estimators, and black dashed lines are the means of the distributions. The FrankenStat distributions show excellent agreement with the Combined data. Both FrankenStat and the traditionally-combined data show a marked improvement in SNR and detection significance over the individual datasets.
  • Figure 4: PTA sensitivity curves from a representative simulation of the Combined and FrankenStat data. In this plot, we show two types of sensitivity curve: one including the effects of the timing model fit, intrinsic red noise, white noise (WN), and GWB self-noise ("Full Noise"), and one with only timing model effects and WN ("Only TM"). For the sensitivity curve with TM + WN, the differences between the Combined and FrankenStat curves are due only to the different TMs, since the WN is kept constant. These sensitivity curves are nearly identical and show at most a 0.05% difference. In this simulation, the difference is always positive, which means that FrankenStat is slightly less sensitive but negligibly so when only accounting for the TM. When including the noise, the differences become only slightly larger and peak at about $1\%$. These differences are now mostly due to the separate noise parameters that come from the Combined and FrankenStat MAP noise and MAP CURN estimates.
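
The split step in the Figure 1 pipeline ("we take every third TOA from the original data to create three new PTA datasets") might look like the sketch below. It assumes an interleaved split of the time-sorted TOAs, which is one reading of the caption; the function name `split_every_nth` and the dictionary layout are hypothetical.

```python
import numpy as np


def split_every_nth(toas, residuals, n_splits=3):
    """Split one pulsar's data into n_splits interleaved subsets.

    Subset k holds TOAs k, k + n_splits, k + 2 * n_splits, ... of the
    time-sorted data, so the subsets are disjoint and cover every TOA.
    """
    order = np.argsort(toas)
    toas, residuals = toas[order], residuals[order]
    return [
        {"toas": toas[k::n_splits], "residuals": residuals[k::n_splits]}
        for k in range(n_splits)
    ]


# Toy data in arbitrary units: 100 TOAs for a single pulsar.
rng = np.random.default_rng(1)
toas = rng.uniform(0.0, 15.0, size=100)
residuals = rng.normal(0.0, 1e-7, size=toas.size)
splits = split_every_nth(toas, residuals)
```

Each subset spans nearly the full observing baseline at one third of the cadence. In the pipeline, each split dataset then gets its own noise analysis and TM refit before the subsets are recombined.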

Theorems & Definitions (2)

  • Lemma 1
  • proof