Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents

Shuo Ren, Pu Jian, Zhenjiang Ren, Chunlin Leng, Can Xie, Jiajun Zhang

TL;DR

This paper surveys LLM-based scientific agents, highlighting their distinction from general-purpose LLMs through domain-specific knowledge integration, tooling, and validation. It organizes the field around architecture (Planner, Memory, Tool Set), benchmarks, applications, and ethics, and discusses challenges and future directions. Key contributions include a taxonomy of planners (prompt-based, SFT-based, RL-based, process supervision), memory modalities (historical context, external KBs, intrinsic knowledge), and tool sets (APIs and simulators), along with a synthesis of benchmarks and real-world deployments across chemistry, biology, physics, astronomy, ML, and literature review. The survey also addresses ethical considerations, reproducibility, and governance to guide responsible development and deployment.

Abstract

As scientific research becomes increasingly complex, innovative tools are needed to manage vast data, facilitate interdisciplinary collaboration, and accelerate discovery. Large language models (LLMs) are now evolving into LLM-based scientific agents that automate critical tasks, ranging from hypothesis generation and experiment design to data analysis and simulation. Unlike general-purpose LLMs, these specialized agents integrate domain-specific knowledge, advanced tool sets, and robust validation mechanisms, enabling them to handle complex data types, ensure reproducibility, and drive scientific breakthroughs. This survey provides a focused review of the architectures, design, benchmarks, applications, and ethical considerations surrounding LLM-based scientific agents. We highlight why they differ from general agents and the ways in which they advance research across various scientific fields. By examining their development and challenges, this survey offers a comprehensive roadmap for researchers and practitioners to harness these agents for more efficient, reliable, and ethically sound scientific discovery.

Paper Structure

This paper contains 50 sections, 2 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: A typical architecture of LLM-based scientific agents. Note that in mainstream agent frameworks, planners are predominantly implemented with LLMs, and their capabilities include task planning, reflection, and verification, among others. For the sake of abstraction, we represent these functions with a single planner in this architecture diagram. In specific implementations, however, different agents might be set up to accomplish distinct functions (see Section \ref{sec:sig-mul} for further discussion of single-agent vs. multi-agent planners).
  • Figure 2: Taxonomy of the planner of science agents.
  • Figure 3: The types of planners in LLM-based scientific agents. (a) Prompt-based planner; (b) SFT-based planner; (c) RL-based planner; (d) Process-supervision-based planner.
  • Figure 4: Taxonomy of the memory mechanism of science agents.
  • Figure 5: A simple process of scientific agents using historical context (e.g., comments provided by the Review Agent and errors in each round of experiments).
  • ...and 5 more figures