AstroAgents: A Multi-Agent AI for Hypothesis Generation from Mass Spectrometry Data
Daniel Saeedi, Denise Buckner, Jose C. Aponte, Amirali Aghazadeh
TL;DR
AstroAgents is a modular, eight-agent, LLM-driven framework for generating and evaluating hypotheses from high-dimensional mass spectrometry data in astrobiology. By separating data analysis, planning, hypothesis generation across chemical classes, literature grounding, and critique, the system aims to produce novel, literature-consistent hypotheses about meteoritic organics and early solar system chemistry. In two comparative experiments, Claude 3.5 Sonnet produced hypotheses that were more consistent with existing literature and contained fewer logical errors (average evaluation score $6.58 \pm 1.74$), but none of them were novel. Gemini 2.0 Flash generated more hypotheses ($101$) with greater novelty but more inconsistencies (average evaluation score $5.67 \pm 0.64$); $36\%$ of Gemini’s hypotheses were plausible, and $66\%$ of those plausible hypotheses were novel. These results illustrate a tradeoff between the strength of agent collaboration and the breadth of context available to the underlying model, suggesting a path toward robust, literature-grounded hypothesis generation for meteoritic organic chemistry and origins-of-life studies.
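To make the nested percentages concrete, the short sketch below converts them into approximate hypothesis counts. It assumes that both rates apply to the $101$ Gemini 2.0 Flash hypotheses and that rounding to the nearest whole hypothesis is acceptable; the function name `implied_counts` is illustrative and not part of AstroAgents, and the resulting counts are derived estimates rather than figures reported by the authors.

```python
def implied_counts(total, plausible_rate, novel_rate_among_plausible):
    """Convert nested percentages into approximate whole-number counts."""
    # Hypotheses judged plausible, rounded to the nearest whole hypothesis.
    plausible = round(total * plausible_rate)
    # Novel hypotheses are counted only among the plausible ones.
    novel = round(plausible * novel_rate_among_plausible)
    return plausible, novel


# Assumption: both rates refer to the 101 Gemini 2.0 Flash hypotheses.
plausible, novel = implied_counts(101, 0.36, 0.66)
print(f"plausible: ~{plausible}, plausible and novel: ~{novel}")
```

Under these assumptions, roughly a quarter of all generated hypotheses ($0.36 \times 0.66 \approx 0.24$) would be both plausible and novel.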
Abstract
With upcoming sample return missions across the solar system and the increasing availability of mass spectrometry data, there is an urgent need for methods that analyze such data within the context of existing astrobiology literature and generate plausible hypotheses regarding the emergence of life on Earth. Hypothesis generation from mass spectrometry data is challenging due to factors such as environmental contaminants, the complexity of spectral peaks, and difficulties in cross-matching these peaks with prior studies. To address these challenges, we introduce AstroAgents, a large language model-based, multi-agent AI system for hypothesis generation from mass spectrometry data. AstroAgents is structured around eight collaborative agents: a data analyst, a planner, three domain scientists, an accumulator, a literature reviewer, and a critic. The system processes mass spectrometry data alongside user-provided research papers. The data analyst interprets the data, and the planner delegates specific segments to the scientist agents for in-depth exploration. The accumulator then collects and deduplicates the generated hypotheses, and the literature reviewer identifies relevant literature using Semantic Scholar. The critic evaluates the hypotheses, offering rigorous suggestions for improvement. To assess AstroAgents, an astrobiology expert evaluated the novelty and plausibility of more than a hundred hypotheses generated from data obtained from eight meteorites and ten soil samples. Of these hypotheses, 36% were identified as plausible, and among those, 66% were novel. Project website: https://astroagents.github.io/
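The agent workflow described in the abstract can be sketched as a single pass through six stages. This is a minimal illustration under stated assumptions, not the authors' implementation: the `llm` and `search` callables, the `Hypothesis` class, the role names, the one-segment-per-line plan format, and the exact-match deduplication are all hypothetical simplifications. In particular, `search` is a stand-in for a literature lookup and does not call the Semantic Scholar API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for an LLM call: (role, prompt) -> response text.
LLM = Callable[[str, str], str]
# Hypothetical stand-in for a literature search: query -> list of references.
Search = Callable[[str], List[str]]


@dataclass
class Hypothesis:
    author: str
    statement: str
    references: List[str] = field(default_factory=list)
    critique: str = ""


def run_round(data: str, papers: str, llm: LLM, search: Search,
              n_scientists: int = 3) -> List[Hypothesis]:
    # 1. The data analyst interprets the data alongside the provided papers.
    analysis = llm("data_analyst", f"{papers}\n{data}")

    # 2. The planner delegates segments (assumed here: one per non-empty line).
    plan_lines = llm("planner", analysis).splitlines()
    segments = [s for s in plan_lines if s.strip()][:n_scientists]

    # 3. Each scientist agent explores its assigned segment.
    raw = []
    for i, segment in enumerate(segments, start=1):
        name = f"scientist_{i}"
        raw.append(Hypothesis(author=name, statement=llm(name, segment)))

    # 4. The accumulator collects and deduplicates hypotheses.
    #    Exact string matching is a simplification made for this sketch.
    seen, unique = set(), []
    for h in raw:
        key = h.statement.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(h)

    # 5. The literature reviewer attaches relevant references.
    for h in unique:
        h.references = search(h.statement)

    # 6. The critic evaluates each hypothesis and suggests improvements.
    for h in unique:
        h.critique = llm("critic", f"{h.statement}\nrefs: {h.references}")
    return unique


# Stub callables so the sketch runs without any external service.
def fake_llm(role: str, prompt: str) -> str:
    if role == "planner":
        return "segment one\nsegment two\nsegment three"
    if role.startswith("scientist"):
        # Two scientists return the same statement to exercise deduplication.
        return "H-B" if role == "scientist_3" else "H-A"
    return f"[{role}] ok"


def fake_search(query: str) -> List[str]:
    return [f"paper about {query}"]


result = run_round("placeholder spectra", "placeholder papers",
                   fake_llm, fake_search)
for h in result:
    print(h.author, h.statement, h.references, h.critique)
```

Passing the model and the search function in as plain callables keeps the orchestration logic testable with stubs, and lets different language models be swapped in without changing the stage order.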
