Empowering Biomedical Discovery with AI Agents

Shanghua Gao, Ada Fang, Yepeng Huang, Valentina Giunchiglia, Ayush Noori, Jonathan Richard Schwarz, Yasha Ektefaie, Jovana Kondic, Marinka Zitnik

TL;DR

The paper envisions biomedical AI agents as collaborative, skeptical researchers—composed of LLMs, ML tools, experimental platforms, and human input—to accelerate discovery by decomposing complex problems into subtasks and continually updating knowledge. It presents a compound AI framework with perception, memory, interaction, and reasoning modules, and a taxonomy of multi-agent collaboration schemes and autonomy levels (0–3) across genetics, cell biology, and chemical biology. It discusses practical challenges—robustness, uncertainty, evaluation, data governance, and safety—alongside a roadmap for building capable agents, including governance and responsible deployment. The work highlights potential impacts such as virtual cell simulations, programmable phenotypic control, cellular circuit design, and novel therapies, while emphasizing the need for data availability, standardization, and ethical guidelines. Overall, it lays out a structured vision for integrating diverse AI capabilities into biomedical discovery to enhance efficiency, scale, and creativity with careful oversight.

Abstract

We envision "AI scientists" as systems capable of skeptical learning and reasoning that empower biomedical research through collaborative agents that integrate AI models and biomedical tools with experimental platforms. Rather than taking humans out of the discovery process, biomedical AI agents combine human creativity and expertise with AI's ability to analyze large datasets, navigate hypothesis spaces, and execute repetitive tasks. AI agents are poised to be proficient in various tasks, planning discovery workflows and performing self-assessment to identify and mitigate gaps in their knowledge. These agents use large language models and generative models to feature structured memory for continual learning and use machine learning tools to incorporate scientific knowledge, biological principles, and theories. AI agents can impact areas ranging from virtual cell simulation, programmable control of phenotypes, and the design of cellular circuits to developing new therapies.

Paper Structure (3 sections, 6 figures, 2 tables)

This paper contains 3 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Empowering biomedical research with AI agents. AI agents pave the way for "AI scientists" capable of skeptical learning and reasoning. These multi-agent systems consist of agents based on conversable large language models (LLMs) and can coordinate machine learning (ML) tools, experimental platforms, humans, or even combinations of them. Robotic agent, AI agent that operates robotic hardware for physical experiments; Database agent, AI agent that can access information in databases via 'function calling' and APIs; Reasoning agent, AI agent capable of direct reasoning and reasoning with feedback; Hypothesis agent, AI agent that is creative and reflective when developing hypotheses, capable of characterizing its own uncertainty and using that as a driver to refine its scientific knowledge bases; Brainstorming agent, AI agent that generates a broad spectrum of research ideas; Search engine agent, AI agent that uses search engines as tools to rapidly gather information; Analysis agent, AI agent capable of analyzing experimental results to summarize findings and synthesize concepts; Experimental planning agent, AI agent that optimizes an experimental protocol for execution.
  • Figure 2: Evolving use of data-driven models in research. Data-driven approaches, from databases and search engines, machine learning, and interactive learning models to advanced agent systems, have reshaped biomedical research throughout the last several decades. Dashed boxes represent studies focused predominantly on algorithmic machine learning innovation; solid-line boxes represent studies focused predominantly on biomedical discovery.
  • Figure 3: Diverse configurations of AI agents in biomedicine -- from an LLM-based AI agent to a multi-agent system with AI models, tools, and integrated physical devices. a. By programming an LLM with the role, one LLM-based agent, equipped with memory and reasoning abilities, performs multi-modal perception and utilizes a range of tools, e.g., web lab tools, to accomplish specified tasks. b-e. Leveraging AI agents equipped with diverse roles, perception modules, tools, and domain knowledge enables collaboration between agents and scientists. This collaboration can adopt various schemes, such as expert consultation, debate, brainstorming, and round table discussions. f. Multi-agent systems can establish a self-driving laboratory wherein numerous agents collaborate on multiple iterations of biological research assisted by humans. Each cycle of research encompasses the generation of hypotheses, the design of experiments, the execution of experiments both in silico and in vitro, and the analysis of results. Computing agent, AI agent that utilizes computational models as tools; Decision agent, AI agent that makes decisions in response to given conditions; Database agent, AI agent that retrieves relevant information from databases; Reasoning agent, AI agent capable of direct reasoning and reasoning with feedback; Expert agent, AI agent that provides professional consultation based on reliable sources, such as domain expertise, feedback from human experts, and the results of specific tools; Hypothesis agent, AI agent capable of reflective learning and reasoning to generate hypotheses; Planner agent, AI agent that devises plans for future actions; In silico/vitro agent, AI agent that uses tools in an in silico or in vitro environment.
  • Figure 4: Four key modules of biomedical AI agents: perception, interaction, reasoning, and memory modules. Perception interprets multi-modal environmental data. Interaction facilitates engagement with the environment, encompassing human-agent interactions, multi-agent interactions, and tool use. Memory is responsible for the storage and retrieval of knowledge, while Learning focuses on the acquisition and updating of knowledge. Reasoning, with or without environmental feedback, plays a crucial role in planning and decision-making processes. Cross-modal alignment is a key technique for the perception of LLM-based agents, where inputs from different modalities are aligned within a text-centered representation space. This alignment enables the LLM to perceive and process various input modalities. Reasoning patterns for AI agents indicate transitions between reasoning thoughts. For instance, agents with a chain of thought pattern generate reasoning in a step-by-step manner.
  • Figure 5: Illustration of components in biomedical AI agents. a. Use of a short-term memory module to recall previous relevant experiments for small molecule inhibitor design. b. Use of a long-term memory module to retrieve relevant information for target selection for a disease. c. Use of reasoning without scientist feedback in gene prioritization for phenotype analysis. d. Use of reasoning with feedback from scientists to select an alternative experimental approach. e. Combining perception, interaction, memory, and reasoning modules to study the selection against pathogenic mitochondrial DNA in the germline.
  • ...and 1 more figure