Inferring planet occurrence rates from radial velocities

J. P. Faria, J. -B. Delisle, D. Ségransan

TL;DR

The paper addresses the biases that arise when planet occurrence rates are inferred from radial-velocity surveys whose datasets differ in sensitivity. It proposes a Bayesian framework that combines per-star posterior samples via importance sampling to estimate the occurrence rate $f_R$ for a region $R$ in $(P,m)$ space, without relying on injection-recovery tests or explicit detection thresholds. The approach is validated on simulated data, where it gives unbiased region-wise estimates whose precision improves as more stars are included. It is implemented in the kima package, with a public Python prototype. Because it is threshold-free and reuses existing per-star posteriors, the methodology enables efficient population inference across RV datasets and can be extended to other detection methods and to multiplicity analyses.

Abstract

We introduce a new method to infer the posterior distribution for planet occurrence rates from radial-velocity (RV) observations. The approach combines posterior samples from the analysis of individual RV datasets of several stars, using importance sampling to reweight them appropriately. This eliminates the need for injection-recovery tests to compute detection limits and avoids the explicit definition of a detection threshold. We validate the method on simulated RV datasets and show that it yields unbiased estimates of the occurrence rate in different regions, with increasing precision as more stars are included in the analysis.
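The reweighting idea can be illustrated with a deliberately simplified sketch. It is not the paper's estimator and not the kima API. It assumes each posterior sample is reduced to a single yes/no indicator ("at least one planet in region $R$"), that the per-star analyses used a prior that puts probability $f_0$ on that event, and that the prior on $f_R$ is uniform. The function names `fraction_in_region` and `occurrence_posterior`, and the NaN-padded array layout, are hypothetical choices made for this example.

```python
import numpy as np

def fraction_in_region(periods, masses, P_range, m_range):
    """Fraction of posterior samples with at least one planet inside the region.

    periods, masses: arrays of shape (n_samples, max_planets), padded with NaN
    where a sample has fewer planets (an assumed layout, for illustration only).
    """
    inside = ((periods >= P_range[0]) & (periods < P_range[1]) &
              (masses >= m_range[0]) & (masses < m_range[1]))
    # NaN comparisons evaluate to False, so padded slots never count
    return inside.any(axis=1).mean()

def occurrence_posterior(p_in_region, f0, grid=None):
    """Grid posterior for f_R from per-star posterior fractions (uniform prior).

    p_in_region: per-star fraction of posterior samples inside the region.
    f0: prior probability of a sample falling inside the region
        (scalar, or one value per star).
    """
    if grid is None:
        grid = np.linspace(1e-4, 1 - 1e-4, 999)
    p = np.asarray(p_in_region, dtype=float)[:, None]
    f0 = np.broadcast_to(np.asarray(f0, dtype=float), (p.shape[0],))[:, None]
    # Average importance weight per star: samples inside the region are
    # reweighted by f/f0, samples outside by (1 - f)/(1 - f0)
    like = p * grid / f0 + (1 - p) * (1 - grid) / (1 - f0)
    log_post = np.log(like).sum(axis=0)
    post = np.exp(log_post - log_post.max())
    post /= post.sum() * (grid[1] - grid[0])  # normalise on the uniform grid
    return grid, post
```

In this simplified form, a star whose posterior fraction equals $f_0$ contributes a flat likelihood and leaves the estimate unchanged, while stars with confident detections or confident non-detections pull the posterior for $f_R$ up or down. No detection threshold enters at any point.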

Paper Structure

This paper contains 14 sections, 16 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Example of two simulated datasets, containing no planet signal (left) and two planet signals (right). The top panels show the RV timeseries and random samples from the posterior predictive distribution. The bottom panels show the posterior distributions for $N_p$.
  • Figure 2: Joint posteriors for the minimum mass and orbital period, shown as hexbin density plots, for the same example datasets as in Fig. 1. All samples with $N_p>0$ are shown. The three regions where the occurrence rates are estimated are shown as coloured boxes and labels.
  • Figure 3: Occurrence rate estimates for the three regions $R_1$, $R_2$, $R_3$, from the analysis of 5, 25, and all 50 simulated datasets. The panels show the prior and posterior distributions (dotted and solid lines, respectively), and the true simulated values (vertical dashed lines). Each panel also shows the mean and standard deviation of the posterior distribution.
  • Figure 4: Procedure to estimate $f_0$ using random samples from the prior distribution that fall inside each region of interest.
  • Figure 5: Probabilistic graphical model for the RV analysis. An arrow between two nodes indicates the direction of conditional dependence. The circled nodes are parameters of the model, whose joint distribution is sampled. The filled node represents the observed RVs. The $t_i$ and $\sigma_i$ nodes are assumed given and thus fixed, while the $v_i$ are deterministic functions of other nodes. Variables inside boxes are repeated a given number of times. The dashed line connecting $f_R$ indicates the importance sampling scheme used to estimate the posterior for this parameter.
  • ...and 3 more figures