
Why Estimating $η_\oplus$ is Difficult: A Kepler-Centric Perspective

Steve Bryson, Michelle Kunimoto, Ruslan Belikov, Galen J. Bergsten, Sakhee Bhure, William J. Borucki, Douglas A. Caldwell, Aritra Chakrabarty, Rachel B. Fernandes, Matthias Y. He, Jon M. Jenkins, Kristo Ment, Michael R. Meyer, Gijs D. Mulders, Ilaria Pascucci, Peter Plavchan

TL;DR

This paper examines why estimates of $η_\oplus$ from Kepler data diverge, distinguishing easy sources of variation (definitions, data choices, and catalogs) from hard, intrinsic data limitations (scarcity of detections and the need to extrapolate into the habitable zone). It evaluates how DR25 improved catalog completeness and reliability but highlights that large portions of the habitable zone remain beyond measured completeness, necessitating extrapolation and model assumptions that can bias results. The authors argue for improved data processing, refined input catalogs (e.g., Gaia-based stellar properties), and new observations from upcoming missions (PLATO, Earth 2.0, Roman) to anchor estimates of Earth-like planet occurrence. They also discuss strategies to leverage Kepler data more effectively, including pixel-level vetting and higher-fidelity completeness/reliability assessments. Overall, the study clarifies the limitations of Kepler for $η_\oplus$ and points toward an integrated, data-rich approach involving future observatories and deeper Kepler reanalysis to reduce biases and uncertainties.

Abstract

$η_{\oplus}$, the occurrence rate of rocky habitable zone exoplanets orbiting Sun-like stars, is of great interest to both the astronomical community and the general public. The Kepler space telescope has made it possible to estimate $η_{\oplus}$, but estimates by different groups vary by more than an order of magnitude. We identify several causes for this range of estimates. We first review why, despite being designed to estimate $η_{\oplus}$, Kepler's observations are not sufficient for a high-confidence estimate, due to Kepler's detection limit coinciding with the $η_{\oplus}$ regime. This results in a need to infer $η_{\oplus}$, for example extrapolating from a regime of non-habitable zone, non-rocky exoplanets. We examine two broad classes of causes that can account for the large discrepancy in $η_\oplus$ found in the literature: a) differences in definitions and input data between studies, and b) fundamental limits in Kepler data that lead to large uncertainties and poor accuracy. We highlight the risk of large biases when using extrapolation to describe small exoplanet populations in the habitable zone. We discuss how $η_{\oplus}$ estimates based on Kepler data can be improved, such as reprocessing Kepler data for more complete, higher-reliability detections and better exoplanet catalog characterization. We briefly survey upcoming space telescopes capable of measuring $η_{\oplus}$, and how they can be used to supplement Kepler data.

Paper Structure

This paper contains 25 sections, 9 figures, 1 table.

Figures (9)

  • Figure 1: A comparison of representative $\eta_\oplus$ estimates (Petigura2013, ForemanMackey2014, Silburt2015, Burke2015, SAG13, Mulders2018, Hsu2019, Pascucci2019, Bryson2020, Kunimoto2020a, Bryson2021, Bergsten2022), with the definition of the $\eta_\oplus$ regime shown on the right. Bryson2021 integrated over the optimistic habitable zone (Kopparapu2013) based on individual stars' stellar effective temperature ($237$–$860$ days for a Sun-like star), while all other works integrated over a single period or instellation flux range for all stars. We show the Bryson2021 estimates for two bounding completeness extrapolations. The bracket labeled "DR25 and Gaia" shows the studies using the DR25 exoplanet candidate catalog and Gaia stellar properties as discussed in §\ref{section:catalogs}. The bracket labeled "Reliability" shows the studies that applied reliability correction, discussed in §\ref{section:reliability}.
  • Figure 2: The period distribution of transit detections (blue) and exoplanet candidates (orange) in the Kepler DR25 catalog. The vertical grey line at 372.5 days shows the orbital period of the Kepler telescope. The large spike in detections near the Kepler orbital period is due to instrumental false alarms.
  • Figure 3: Top: the Kepler exoplanet candidate population around FGK stars used in the analysis of Bryson2021, shown in period and radius, both colored and sized by catalog reliability with exoplanet radius error bars. The background color map and contours indicate detection completeness. The rectangles show example $\eta_\oplus$ definitions from different authors. Adapted from Bryson2020. Bottom: The same exoplanet population shown in period and host star effective temperature, which allows comparison with and approximation of the habitable zone. The green lines show period range of the optimistic (solid) and conservative (dashed) habitable zones for average dwarf stars given effective temperature. The period beyond which there is no completeness data and the longest detectable period are shown by vertical lines. The $\oplus$ symbol shows the Earth's orbital period at the Sun's effective temperature. Reliability values in both panels are from Bryson2020.
  • Figure 4: Two views of the DR25 planet candidate (PC) population with radii smaller than $2.5~R_\oplus$ and instellation flux near the habitable zone of their FGK main-sequence dwarf host stars. Top: Instellation flux vs. stellar effective temperature, showing the habitable zone and Kepler observational coverage. The background color map gives, at each point, the fraction of Kepler target stars at that effective temperature whose exoplanets at that instellation flux would have orbital periods of 710 days or less, so that three transits could be observed. The contours show the fraction of exoplanets with periods of 500 days or less, indicating available completeness measurements. The solid green lines are the boundaries of the optimistic habitable zone, while the dashed green lines are the boundaries of the conservative habitable zone. The exoplanets are sized by their radius and colored by their catalog reliability. Bottom: Instellation flux vs. exoplanet radius. The color map and contours show the average completeness for the stellar population (in the case of zero-completeness extrapolation, see §\ref{section:completeness}). The exoplanets are sized and colored by catalog reliability, with radius and instellation flux error bars. In the lower panel the $\oplus$ symbol shows the Earth. The two numbered exoplanets are discussed in §\ref{section:needExtrapolate}. From Bryson2021. Reliability values in both panels are from Bryson2020.
  • Figure 5: The habitable zone flux range compared with an example orbital period range previously used to estimate habitable zone occurrence for F, G, and K stars. For each star in the stellar parent sample, we show the instellation flux range corresponding to the SAG13 orbital period range as a horizontal grey line. The solid green lines are the boundaries of the optimistic habitable zone, while the dashed green lines are the boundaries of the conservative habitable zone. The exoplanet population is the same as in Figure \ref{figure:populations}, and exoplanets are sized by their radius. Adapted from Bryson2021.
  • ...and 4 more figures
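
Several captions above move between instellation flux and orbital period (for example, the $237$–$860$ day optimistic habitable zone quoted for a Sun-like star in Figure 1). The following is a minimal sketch of that conversion, assuming circular orbits, the inverse-square law for flux, and Kepler's third law. The function name `period_from_flux` and the flux boundaries `1.78` and `0.32` (in units of Earth's instellation) are illustrative assumptions chosen to approximate the optimistic habitable zone of a Sun-like star; they are not values taken from the paper.

```python
import math

YEAR_DAYS = 365.25  # Earth's orbital period in days


def period_from_flux(flux_earth, lum_sun=1.0, mass_sun=1.0):
    """Orbital period in days of a circular orbit receiving `flux_earth`
    times Earth's instellation, for a star of luminosity `lum_sun` (solar
    units) and mass `mass_sun` (solar units)."""
    # Inverse-square law: flux = L / a^2, so a = sqrt(L / flux) in AU.
    a_au = math.sqrt(lum_sun / flux_earth)
    # Kepler's third law: P^2 = a^3 / M, with P in years and a in AU.
    return YEAR_DAYS * math.sqrt(a_au ** 3 / mass_sun)


# ASSUMPTION: approximate optimistic habitable zone flux boundaries for a
# Sun-like star (inner edge, outer edge), in units of Earth's instellation.
OPTIMISTIC_HZ_FLUX = (1.78, 0.32)

inner_days, outer_days = (period_from_flux(f) for f in OPTIMISTIC_HZ_FLUX)
print(f"Optimistic HZ period range: {inner_days:.0f} to {outer_days:.0f} days")
```

Because luminosity and mass vary from star to star, a fixed flux range maps to a different period range for each host, which is why a single period box and a per-star habitable zone select different planets.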