Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics

Lianhao Zhou, Hongyi Ling, Cong Fu, Yepeng Huang, Michael Sun, Wendi Yu, Xiaoxuan Wang, Xiner Li, Xingyu Su, Junkai Zhang, Xiusi Chen, Chenxing Liang, Xiaofeng Qian, Heng Ji, Wei Wang, Marinka Zitnik, Shuiwang Ji

TL;DR

The paper argues that large language model–based autonomous agents can orchestrate scientists, language, code, and physics to accelerate the scientific discovery lifecycle, from hypothesis discovery through experimental design and execution to result analysis and refinement. It introduces an information-theoretic framework centered on entropy, verifiability, and dissipation, and proposes a five-level autonomy ladder for measuring agent capability across discovery phases. It surveys knowledge extraction, hypothesis generation, experimental design and execution, and result analysis, and treats tool use and tool creation as core operational modes, covering domain-specific agents and multi-agent collaboration. The work also addresses challenges in agentic reinforcement learning, interaction with physical tools and environments, and the role of serendipity, and it outlines directions for building more robust, generalizable, and adaptive scientific agents across disciplines.

Abstract

Computing has long served as a cornerstone of scientific discovery. Recently, a paradigm shift has emerged with the rise of large language models (LLMs), introducing autonomous systems, referred to as agents, that accelerate discovery across varying levels of autonomy. These language agents provide a flexible and versatile framework that orchestrates interactions with human scientists, natural language, computer language and code, and physics. This paper presents our view and vision of LLM-based scientific agents and their growing role in transforming the scientific discovery lifecycle, from hypothesis discovery, experimental design and execution, to result analysis and refinement. We critically examine current methodologies, emphasizing key innovations, practical achievements, and outstanding limitations. Additionally, we identify open research challenges and outline promising directions for building more robust, generalizable, and adaptive scientific agents. Our analysis highlights the transformative potential of autonomous agents to accelerate scientific discovery across diverse domains.

Paper Structure

This paper contains 26 sections, 12 figures, 3 tables.

Figures (12)

  • Figure 1: An overview of the three-phase workflow for AI-driven scientific discovery. The process begins with Phase 1: Hypothesis Discovery, where a high-level human goal is transformed through knowledge extraction and hypothesis generation into novel, verifiable scientific questions. In Phase 2: Experimental Design & Execution, these hypotheses are translated into detailed workflows which the agent executes by either using existing tools or creating new scientific tools to generate experimental data. Finally, Phase 3: Result Analysis & Refinement involves interpreting the data and entering an iterative refinement loop to progressively improve the process and arrive at validated findings.
  • Figure 2: A comprehensive overview of the role of LLM-based agents in the scientific discovery lifecycle, alongside a diverse collection of domain-specific scientific agents, with research systems and papers organized by field: Genomics, Protein, Medicine, Chemistry, Materials, Physics, and Others.
  • Figure 3: An information-theoretic framework for autonomous scientific discovery, illustrating the inverse relationship between Information Entropy and Verifiability.
  • Figure 4: An overview of autonomous agents for scientific discovery in which an agent orchestrates scientists, language, code, and physics. This figure illustrates the dynamic, closed-loop workflow of an LLM-based scientific agent acting as the coordinator of four key components: scientists, language, code, and physics. The agent continuously interacts with human scientists, receiving goals, guidance, and feedback to direct research, and providing summaries and findings. Its interaction with language involves extracting knowledge from literature to formulate verifiable hypotheses and detailed research plans. The agent's interface with code translates high-level plans into executable programs for simulations or instrument control by integrating tool functionalities. Finally, it interacts with physics by using raw data and laws to direct physical or simulated instruments, yielding experimental results. Together these interactions form an iterative, autonomous discovery cycle that bridges human intent and empirical evidence.
  • Figure 5: A heatmap representation of information analysis across autonomous scientific discovery phases.
  • ...and 7 more figures
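
The three-phase workflow described in the Figure 1 caption can be read as a simple control loop. The Python sketch below is only an illustration under stated assumptions, not the paper's implementation: `DiscoveryState`, `run_discovery`, and the `propose`, `execute`, and `accept` callbacks are hypothetical names introduced here, and a single numeric score stands in for experimental data.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DiscoveryState:
    """Running record of one discovery campaign (hypothetical structure)."""
    goal: str
    hypotheses: List[str] = field(default_factory=list)
    results: List[float] = field(default_factory=list)
    validated: List[str] = field(default_factory=list)


def run_discovery(
    goal: str,
    propose: Callable[[DiscoveryState], str],
    execute: Callable[[str], float],
    accept: Callable[[float], bool],
    max_rounds: int = 3,
) -> DiscoveryState:
    """Iterate the three phases until a hypothesis is accepted or rounds run out."""
    state = DiscoveryState(goal=goal)
    for _ in range(max_rounds):
        # Phase 1: hypothesis discovery, conditioned on everything seen so far.
        hypothesis = propose(state)
        state.hypotheses.append(hypothesis)

        # Phase 2: experimental design and execution (a tool call or simulation).
        score = execute(hypothesis)
        state.results.append(score)

        # Phase 3: result analysis; stop on success, otherwise refine and retry.
        if accept(score):
            state.validated.append(hypothesis)
            break
    return state


if __name__ == "__main__":
    # Toy stand-ins: each round proposes "h<n>" and scores it as n / 2.
    final = run_discovery(
        goal="toy goal",
        propose=lambda s: f"h{len(s.hypotheses)}",
        execute=lambda h: int(h[1:]) / 2,
        accept=lambda score: score >= 1.0,
    )
    print(final.validated)
```

In a real agent the three callbacks would wrap an LLM call, a tool or instrument invocation, and an analysis step respectively; the sketch keeps them as plain functions so that the refinement loop itself stays visible.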