Molecular diversity as a biosignature

Gideon Yoffe; Fabian Klenner; Barak Sober; Yohai Kaspi; Itay Halevy

TL;DR

This work introduces a new class of biosignatures, defined by the statistical organization of molecular assemblages and quantified using ecodiversity metrics, and finds that biotic samples are consistently more diverse than, and therefore distinct from, their sparser abiotic counterparts.

Abstract

The search for life in the Solar System hinges on data from planetary missions. Biosignatures based on molecular identity, isotopic composition, or chiral excess require measurements that current and planned missions cannot provide. We introduce a new class of biosignatures, defined by the statistical organization of molecular assemblages and quantified using ecodiversity metrics. Using this framework, we analyze amino acid diversity across a dataset spanning terrestrial and extraterrestrial contexts. We find that biotic samples are consistently more diverse than, and therefore distinct from, their sparser abiotic counterparts. This distinction holds for fatty acids as well, indicating that the diversity signal reflects a fundamental biosynthetic signature. The signal also persists under space-like degradation. Because it relies only on relative abundances, this biogenicity assessment strategy is applicable to any molecular composition data from archived, current, and planned planetary missions. By capturing a fundamental statistical property of life's chemical organization, it may also transcend biosignatures that are contingent on Earth's evolutionary history.
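To make "ecodiversity metrics" computed from relative abundances concrete, the following is a minimal Python sketch of one common choice: Hill numbers of order $q$, normalized by richness to give an evenness curve $E(q)$. The abstract does not spell out the authors' exact definition, so the normalization used here (Hill number divided by the number of detected species) and the function names `hill_number` and `evenness_curve` are assumptions made for illustration only.

```python
import math

def hill_number(abundances, q):
    """Hill number (effective number of species) of order q.

    Only relative abundances matter: inputs are normalized to sum to 1.
    """
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if abs(q - 1.0) < 1e-12:
        # q = 1 is defined as the limit: exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

def evenness_curve(abundances, orders):
    """Assumed evenness definition: Hill number divided by richness.

    Equals 1 at every q for a perfectly uniform assemblage and falls
    below 1 as q grows when a few species dominate.
    """
    richness = sum(1 for a in abundances if a > 0)
    return [hill_number(abundances, q) / richness for q in orders]

orders = [0, 0.5, 1, 2, 4]
uniform = [1] * 10                           # ten equally abundant species
uneven = [50, 20, 10, 5, 5, 4, 3, 1, 1, 1]   # a few dominant species
uniform_curve = evenness_curve(uniform, orders)
uneven_curve = evenness_curve(uneven, orders)
```

Under this assumed definition, a uniform assemblage gives a flat curve at 1, while an uneven assemblage gives a curve that decreases with $q$, since higher orders weight dominant species more heavily.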

Paper Structure

This paper contains 17 sections, 12 equations, 18 figures.

Figures (18)

Figure 1: Illustrative evenness curves and abundance profiles. Top: Evenness curves for three assemblages of ten species, distributed uniformly (light gray), unevenly (dark gray), and sparsely (black), computed across diversity orders ($q$). Flatter curves indicate more evenly distributed species. Bottom: Corresponding abundance distributions for each assemblage. Species are denoted by letters.
Figure 2: Dissimilarity analysis of evenness curves of amino acid assemblages. (a): Multidimensional Scaling (MDS) projection of dissimilarities between evenness curves, $E(q)$. Points represent samples; distances between samples grow with dissimilarity. Edges connect samples to the 25th percentile of their nearest neighbors. Markers indicate inferred origin: biotic (green hexagons), abiotic (pink circles), and mixed (blue diamonds). (b): Evenness curve distributions of four distinct sample groups. Solid lines represent mean values, and filled areas represent the one-standard-deviation interval. (c): Predictive power for a sample's origin of $k$-Nearest-Neighbors ($k$NN) classification, applied to pairwise distances between samples projected onto two MDS axes. Accuracy is quantified using a normalized Matthews Correlation Coefficient (MCC), where 50% corresponds to random assignment and 100% indicates perfect classification. Uncertainty was estimated using multiple initializations of the MDS projection.
Figure 3: One-dimensional MDS projection of dissimilarities between evenness curves of samples of biotic and mixed inferred origins.
Figure 4: Diversity analysis of fatty acids. (a) One-dimensional MDS projection of dissimilarities between samples. (b) Evenness curves of the abiotic samples and the biotic sample group. (c) Dissimilarity $Z$-score matrix (a similar matrix for amino acids can be found in Methods). Row and column numbers correspond to the numbering of samples in (a).
Figure 5: Dissimilarity of the diversity signal for simulated biotic profiles of glycine, alanine, and phenylalanine in near-surface ice on Europa’s leading hemisphere (60$^\circ$ latitude), under radiolytic degradation at different depths (10 to 500 millimeters). Each curve is a dissimilarity $Z$-score between the degrading biotic profile and three benchmarks: the original biotic profile (gray), the pristine abiotic reference (black), and a simultaneously degrading abiotic profile (slate gray). Its sharp fluctuations and eventual collapse to a dissimilarity of 0 are due to the degraded profiles becoming too sparse to evaluate. Shaded regions denote one standard deviation across a distribution of biotic evenness curves. A 3$\sigma$ dissimilarity threshold is marked with a dashed gray line.
...and 13 more figures
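
The normalized MCC described in the Figure 2(c) caption (50% for random assignment, 100% for perfect classification) is consistent with the common rescaling $(\mathrm{MCC} + 1)/2$. The captions do not state the formula, so the sketch below assumes that rescaling; the function names and the confusion-matrix counts are hypothetical and are not taken from the paper's data.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom

def normalized_mcc(tp, tn, fp, fn):
    """Assumed rescaling of MCC from [-1, 1] to [0, 1]."""
    return 0.5 * (matthews_corrcoef(tp, tn, fp, fn) + 1.0)

# Hypothetical counts, with "biotic" treated as the positive class.
perfect = normalized_mcc(tp=5, tn=5, fp=0, fn=0)
chance = normalized_mcc(tp=5, tn=5, fp=5, fn=5)
inverted = normalized_mcc(tp=0, tn=0, fp=5, fn=5)
```

With this rescaling, a classifier no better than chance scores 0.5 and a perfect one scores 1.0, matching the 50% and 100% reference points in the caption.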