Distinguishing life from non-life via molecular frontier orbital energy gaps

José L. Ramírez-Colón, Ziqin Ni, Christopher E. Carr

Abstract

Amino acids (AAs) are a key target in the search for life beyond Earth due to their extensive role in the machinery of all known life, persistence over geologic timescales, and analytical detectability. However, AAs can also arise from abiotic processes on planets and in space. For example, material from asteroid Bennu contained 33 AAs, including 15 of the 20 proteinogenic AAs that are fundamental to life's functions. Distinguishing life from non-life based on AAs in a sample remains an unsolved problem, particularly when their isotopic and structural signatures (e.g., chirality) could be altered via physicochemical processes. Here we introduce LUMOS (Life Unveiled via Molecular Orbital Signatures), a statistical framework that distinguishes life from non-life by analyzing the distribution of abundance-weighted HOMO-LUMO gap (HLG) values of AAs within a sample. Compilation of AA datasets from diverse environments and provenances revealed that abiotic samples display highly uniform distributions of AA HLGs. In contrast, biotic samples show greater variance and a preference towards AAs with lower HLG, likely reflecting the need for life to control when, where, and how chemical reactions occur. LUMOS achieves >95% accuracy in distinguishing biotic from abiotic provenance across diverse environmental and extraterrestrial conditions. These results suggest that varied molecular reactivity within biochemical systems may be a universal feature of life, representing an agnostic biosignature unlinked to the specific set of AAs used by life as we know it. LUMOS is compatible with existing analytical instrumentation and is applicable to both returned samples and in situ analyses. Broader characterization of abiotic and biotic environments will further refine the chemical boundaries separating biotic from abiotic chemical systems.
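
To make the idea of an abundance-weighted HLG distribution concrete, the following is a minimal Python sketch of an abundance-weighted mean and variance. It is not the authors' implementation: the function name, the normalization of abundances to weights that sum to one, and the HLG numbers in `EXAMPLE_HLG_EV` are all assumptions chosen for illustration, not values from the paper.

```python
from typing import Mapping, Tuple

# Hypothetical HLG values in eV, for illustration only (NOT computed values).
EXAMPLE_HLG_EV = {"Gly": 10.0, "Ala": 9.0, "Trp": 6.0}


def weighted_mean_and_variance(
    abundances: Mapping[str, float],
    hlg: Mapping[str, float],
) -> Tuple[float, float]:
    """Return the abundance-weighted mean and variance of HLG values.

    Assumption: weights are abundances normalized to sum to 1, and the
    variance is the weighted population variance around the weighted mean.
    Amino acids without an HLG entry or with non-positive abundance are skipped.
    """
    shared = [aa for aa, amount in abundances.items() if aa in hlg and amount > 0]
    if not shared:
        raise ValueError("no amino acid has both a positive abundance and an HLG value")

    total = sum(abundances[aa] for aa in shared)
    weights = {aa: abundances[aa] / total for aa in shared}

    mean = sum(weights[aa] * hlg[aa] for aa in shared)
    variance = sum(weights[aa] * (hlg[aa] - mean) ** 2 for aa in shared)
    return mean, variance


# Two toy samples: one dominated by similar-HLG amino acids, one more varied.
narrow_sample = {"Gly": 1.0, "Ala": 1.0}
varied_sample = {"Gly": 1.0, "Trp": 1.0}

print(weighted_mean_and_variance(narrow_sample, EXAMPLE_HLG_EV))
print(weighted_mean_and_variance(varied_sample, EXAMPLE_HLG_EV))
```

With these toy inputs the second sample has the lower weighted mean and the larger weighted variance, which is the qualitative pattern the abstract attributes to biotic samples.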

Paper Structure (32 sections, 5 equations, 10 figures, 1 table)

Figures (10)

  • Figure 1: Distribution of HOMO–LUMO gaps in biotic and abiotic amino acid environments. a, Construction of an amino acid database from 102 abiotic and 87 biotic samples with abundance measurements, plus 43 simulated abiotic samples without abundance measurements. b, Quantum chemical calculations to determine the HLG for each amino acid. As an example, the frontier molecular orbitals of glycine are illustrated. The blue and yellow lobe colors correspond to the positive and negative phases of the wavefunction, respectively. The orbitals are visualized with VMD [HUMP96] using results from $\omega$B97XD/def2-TZVP/PCM(Water) calculations performed with Psi4 [smith_psi4_2020]. c, Distribution of HLG values [as determined by the $\omega$B97XD/def2-TZVP/PCM(Water) method] for amino acids across three categories: abiotic, abiotic-simulated, and biotic, with corresponding boxplots displaying the statistical distributions for each subcategory. d, Symmetric relative entropy of selected molecular descriptors, highlighting how well each descriptor separates the biotic category from the combined abiotic categories. e, Statistical significance matrix showing pairwise comparisons among the subcategories.
  • Figure 2: Abundance-weighted molecular descriptors improve classification of biotic and abiotic amino acid samples. a, Workflow for assessing the separation between biotic and abiotic samples using a database of amino acid abundances and a molecular properties dataset spanning 56 amino acids. Each amino acid in a sample is matched to a molecular descriptor of interest, which is then weighted by its abundance using one of three approaches: Gini coefficient, weighted mean, or weighted variance. The resulting values are used to evaluate the ability of the descriptor to distinguish between sample classes. b, Distribution of Gini coefficient (log scale), weighted mean, and weighted variance (log scale) values across samples using the HLG. Box plots above each distribution indicate the spread and central tendency of values across biotic and abiotic samples.
  • Figure 3: Performance of abundance-weighted molecular descriptors in biotic–abiotic classification. a, Descriptor performance ranked through symmetric relative entropy analysis of statistical metrics (weighted mean, weighted variance, Gini coefficient) applied to abundance-weighted descriptors. b, Two-dimensional scatter plot of weighted variance (wvar) for HLG (MNDO method) versus Molecular Assembly Index (MAI; the next-best non-HLG descriptor) across biotic and abiotic samples, with histograms displaying the distribution of the wvar values of each sample for both molecular descriptors. c, Receiver operating characteristic (ROC) curves comparing classification performance of wvar for HLG (AUC=0.968), MAI (AUC=0.930), molar mass (AUC=0.869), and carbon number (AUC=0.863).
  • Figure 4: Overview and performance of the LUMOS framework for biogenicity assessment. a, The LUMOS (Life Unveiled via Molecular Orbital Signatures) framework integrates amino acid abundance measurements with quantum chemical descriptors to evaluate biogenicity. First, the abundance of a set of amino acids is estimated, followed by the computation of molecular descriptors (e.g., HOMO–LUMO gap). These descriptors are then weighted by abundance using statistical features such as weighted variance. The resulting value is used to assess the confidence in biogenicity, informed by additional contextual data and environmental provenance. b, Heatmap of $P(B|E)$, the confidence that a sample is biotic ($B$) given the observed weighted variance (evidence $E$) as a function of the number of amino acids measured. The prior $P(B)$ is taken as $0.01$. Color intensity reflects the biogenicity probability as determined by Bayesian inference applied to simulated distributions of biotic and abiotic samples. c, Criteria met by the LUMOS framework for distinguishing biotic from abiotic amino acid systems.
  • Figure S1: Abundance of Amino Acids Across Biotic and Abiotic Classes. a, Total concentration of amino acids (nmol/g) per sample across biotic and abiotic classes. b, Relationship between the number of amino acids detected and their total abundance (nmol/g) per sample in biotic and abiotic classes.
  • ...and 5 more figures
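
The Bayesian step described in the Figure 4b caption can be illustrated with a short Python sketch of Bayes' rule for two hypotheses (biotic versus abiotic). This is only a sketch under stated assumptions: the function name and the idea of passing the two likelihoods $P(E|B)$ and $P(E|\text{abiotic})$ in directly are illustrative choices, and the paper's simulated distributions that would supply those likelihoods are not reproduced here. Only the prior $P(B) = 0.01$ comes from the caption.

```python
def posterior_biotic(
    likelihood_biotic: float,
    likelihood_abiotic: float,
    prior_biotic: float = 0.01,  # prior P(B) stated in the Figure 4b caption
) -> float:
    """Return P(B|E) from Bayes' rule for two exclusive hypotheses.

    likelihood_biotic  : P(E|B), likelihood of the evidence if the sample is biotic
    likelihood_abiotic : P(E|not B), likelihood of the evidence if it is abiotic
    Both likelihoods are hypothetical inputs here; in practice they would be
    estimated from reference distributions of the weighted variance.
    """
    if not 0.0 < prior_biotic < 1.0:
        raise ValueError("prior_biotic must lie strictly between 0 and 1")
    if likelihood_biotic < 0.0 or likelihood_abiotic < 0.0:
        raise ValueError("likelihoods must be non-negative")

    numerator = likelihood_biotic * prior_biotic
    evidence = numerator + likelihood_abiotic * (1.0 - prior_biotic)
    if evidence == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / evidence


# Equal likelihoods leave the posterior at the prior; a larger likelihood
# ratio in favor of the biotic hypothesis raises it.
print(posterior_biotic(1.0, 1.0))
print(posterior_biotic(99.0, 1.0))
```

With a prior of 0.01, the evidence must be 99 times more likely under the biotic hypothesis than under the abiotic one before the posterior reaches 0.5, which shows how conservative such a prior is.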