Modeling Non-Gaussianities in Pulsar Timing Array data analysis using Gaussian Mixture Models
Mikel Falxa, Alberto Sesana
TL;DR
This paper tackles the problem that PTA noise and signals may deviate from Gaussianity, particularly for a SMBHB-induced GWB. It introduces Gaussian Mixture Models on Fourier coefficients to capture non-Gaussian statistics while preserving analytical marginalization, enabling posterior inference of higher-order moments and integration into spectral refitting. Through simulations of single-pulsar noise, common red noise, and a Gaussian background plus loud CGWs, the approach demonstrates Bayes-factor–driven detection of non-Gaussian features and reveals how deterministic sources can suppress apparent non-Gaussianity if properly modeled. In more realistic SMBHB population scenarios, non-Gaussianities become subtler and may require larger PTAs to detect, but the framework provides a principled path to exploiting higher-order statistics to characterize GW source populations and outliers within a stochastic background.
Abstract
In Pulsar Timing Array (PTA) data analysis, noise is typically assumed to be Gaussian, and the marginalized likelihood has a well-established analytical form derived within the framework of Gaussian processes. However, this Gaussianity assumption may break down for certain classes of astrophysical and cosmological signals, particularly for a gravitational wave background (GWB) generated by a population of supermassive black hole binaries (SMBHBs). In this work, we present a new method for testing the presence of non-Gaussian features in PTA data. We go beyond the Gaussian assumption by modeling the noise or signal statistics using a Gaussian mixture model (GMM). An advantage of this approach is that the marginalization of the likelihood remains fully analytical, expressed as a linear combination of Gaussian PTA likelihoods. This makes the method straightforward to implement within existing data analysis tools. Moreover, this method extends beyond the free spectrum analysis by producing posterior probability distributions of higher-order moments inferred from the data, which can be incorporated into spectral refitting techniques. We validate the model using simulations and demonstrate the sensitivity of PTAs to non-Gaussianity by computing the Bayes factor in favor of the GMM as a function of the injected excess moments. We apply the method to a more astrophysically motivated scenario where a single SMBHB is resolved on top of a Gaussian GWB and show that significant non-Gaussianities are introduced by the individual source. Finally, we test our model on a realistic GWB generated from a simulated population of SMBHBs.
