How Far Are AI Scientists from Changing the World?

Qiujie Xie, Yixuan Weng, Minjun Zhu, Fuchen Shen, Shulin Huang, Zhen Lin, Jiahui Zhou, Zilan Mao, Zijie Yang, Linyi Yang, Jian Wu, Yue Zhang

TL;DR

The paper analyzes how far AI scientists are from changing the world by proposing a four-capability framework: knowledge acquisition, idea generation, verification/falsification, and evolution. It surveys knowledge-acquisition methods from pre-LLM to LLM-era approaches, highlighting tools such as SciBERT, PaperWeaver, and RAG-based literature retrieval. It discusses idea generation and verification, showing that current systems can generate hypotheses but struggle with rigorous experimental design and execution, with evaluations revealing substantial gaps. The authors outline future directions, including dynamic planning, autonomous learning, and standardized AI-to-AI communication protocols, arguing for a path toward responsible AI scientists that can accelerate global scientific progress while mitigating risks.

Abstract

The emergence of large language models (LLMs) is propelling automated scientific discovery to the next level, with LLM-based Artificial Intelligence (AI) Scientist systems now taking the lead in scientific research. Several influential works have already appeared in the field of AI Scientist systems, with AI-generated research papers having been accepted at the ICLR 2025 workshop, suggesting that a human-level AI Scientist capable of uncovering phenomena previously unknown to humans may soon become a reality. In this survey, we focus on the central question: How far are AI scientists from changing the world and reshaping the scientific research paradigm? To answer this question, we provide a prospect-driven review that comprehensively analyzes the current achievements of AI Scientist systems, identifying key bottlenecks and the critical components required for the emergence of a scientific agent capable of producing ground-breaking discoveries that solve grand challenges. We hope this survey will contribute to a clearer understanding of the limitations of current AI Scientist systems, showing where we are, what is missing, and what the ultimate goals for scientific AI should be.

Paper Structure

This paper contains 33 sections, 4 figures, 6 tables.

Figures (4)

  • Figure 1: The capability level of an AI Scientist, illustrating the progression from foundational knowledge acquisition (Level 1), through idea generation (Level 2), rigorous hypothesis verification and falsification (Level 3), to continuous evolution (Level 4). We outline the core functions for each capability level.
  • Figure 2: The current capability landscape of AI Scientist systems across four progressive levels. We summarize the current achievements for each level and highlight critical gaps before AI Scientist systems can autonomously make ground-breaking scientific discoveries.
  • Figure 3: An analysis of the number of publications in the field of AI Scientist systems on arXiv. The upper panel shows the average number of citations to date, grouped by whether the papers include implementation details. The lower panel shows the growth in the total number of these papers under the same grouping.
  • Figure 4: Three paradigms of AI reviewer systems with increasing complexity. The process begins with (1) classification & scoring systems that provide quantitative outputs (e.g., scores or accept/reject decisions). This paradigm gradually evolves into (2) generation systems that produce narrative review text. Recently, a more advanced paradigm employs (3) multi-agent systems, where multiple AI agents collaborate to create a comprehensive, multi-faceted evaluation.