Why Estimating $η_\oplus$ is Difficult: A Kepler-Centric Perspective

Steve Bryson; Michelle Kunimoto; Ruslan Belikov; Galen J. Bergsten; Sakhee Bhure; William J. Borucki; Douglas A. Caldwell; Aritra Chakrabarty; Rachel B. Fernandes; Matthias Y. He; Jon M. Jenkins; Kristo Ment; Michael R. Meyer; Gijs D. Mulders; Ilaria Pascucci; Peter Plavchan

TL;DR

This paper examines why estimates of $η_\oplus$ from Kepler data diverge, distinguishing easy sources of variation (definitions, data choices, and catalogs) from hard, intrinsic data limitations (scarcity of detections and the need to extrapolate into the habitable zone). It evaluates how DR25 improved catalog completeness and reliability but highlights that large portions of the habitable zone remain beyond measured completeness, necessitating extrapolation and model assumptions that can bias results. The authors argue for improved data processing, refined input catalogs (e.g., Gaia-based stellar properties), and new observations from upcoming missions (PLATO, Earth 2.0, Roman) to anchor estimates of Earth-like planet occurrence. They also discuss strategies to leverage Kepler data more effectively, including pixel-level vetting and higher-fidelity completeness/reliability assessments. Overall, the study clarifies the limitations of Kepler for $η_\oplus$ and points toward an integrated, data-rich approach involving future observatories and deeper Kepler reanalysis to reduce biases and uncertainties.

Abstract

$η_{\oplus}$, the occurrence rate of rocky habitable zone exoplanets orbiting Sun-like stars, is of great interest to both the astronomical community and the general public. The Kepler space telescope has made it possible to estimate $η_{\oplus}$, but estimates by different groups vary by more than an order of magnitude. We identify several causes for this range of estimates. We first review why, despite being designed to estimate $η_{\oplus}$, Kepler's observations are not sufficient for a high-confidence estimate, due to Kepler's detection limit coinciding with the $η_{\oplus}$ regime. This results in a need to infer $η_{\oplus}$, for example by extrapolating from a regime of non-habitable zone, non-rocky exoplanets. We examine two broad classes of causes that can account for the large discrepancy in $η_{\oplus}$ found in the literature: a) differences in definitions and input data between studies, and b) fundamental limits in Kepler data that lead to large uncertainties and poor accuracy. We highlight the risk of large biases when using extrapolation to describe small exoplanet populations in the habitable zone. We discuss how $η_{\oplus}$ estimates based on Kepler data can be improved, such as reprocessing Kepler data for more complete, higher-reliability detections and better exoplanet catalog characterization. We briefly survey upcoming space telescopes capable of measuring $η_{\oplus}$, and how they can be used to supplement Kepler data.
