Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System

Haoyang Su, Renqi Chen, Shixiang Tang, Zhenfei Yin, Xinzhe Zheng, Jinzhe Li, Biqing Qi, Qi Wu, Hui Li, Wanli Ouyang, Philip Torr, Bowen Zhou, Nanqing Dong

TL;DR

This paper introduces VirSci, an LLM-based multi-agent system that simulates the collaborative dynamics of scientific research within a digital twin Scientific Research Ecosystem. It defines a five-step workflow (Collaborator Selection, Topic Discussion, Idea Generation, Novelty Assessment, and Abstract Generation) and employs retrieval-augmented generation and an invitation mechanism to enable inter- and intra-team collaboration grounded in real-world data. In extensive experiments on large bibliographic datasets, VirSci outperforms single-agent baselines, a result supported by both objective novelty metrics and human assessments. The experiments also offer insights into how team size, turnover, freshness, and diversity influence innovation. The work advances autonomous scientific discovery by providing an ecosystem-backed framework and actionable guidance for designing collaborative AI systems in science.
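The five-step workflow named above can be pictured as an ordered pipeline. The following is only a minimal sketch under stated assumptions, not the VirSci implementation: the names `Step`, `run_workflow`, and `make_stub` are hypothetical, and each step is replaced by a stub that merely records its name where the real system would run an agent discussion.

```python
from enum import Enum
from typing import Callable, Dict


class Step(Enum):
    """The five workflow steps named in the summary, in order."""
    COLLABORATOR_SELECTION = "Collaborator Selection"
    TOPIC_DISCUSSION = "Topic Discussion"
    IDEA_GENERATION = "Idea Generation"
    NOVELTY_ASSESSMENT = "Novelty Assessment"
    ABSTRACT_GENERATION = "Abstract Generation"


State = Dict[str, object]
Handler = Callable[[State], State]


def run_workflow(handlers: Dict[Step, Handler], state: State) -> State:
    """Apply one handler per step, in the order the steps are declared."""
    for step in Step:
        state = handlers[step](state)
    return state


def make_stub(step: Step) -> Handler:
    """Hypothetical stand-in for an agent discussion: it only logs the step name."""
    def handler(state: State) -> State:
        trace = list(state.get("trace", []))
        trace.append(step.value)
        return {**state, "trace": trace}
    return handler


stub_handlers = {step: make_stub(step) for step in Step}
result = run_workflow(stub_handlers, {"trace": []})
print(result["trace"])
```

Keeping each step behind the same `State -> State` signature is one possible design choice; it makes it easy to swap a stub for a real discussion routine without touching the driver loop.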

Abstract

Rapid scientific progress requires innovative tools that can accelerate knowledge discovery. Although recent AI methods, particularly large language models (LLMs), have shown promise in tasks such as hypothesis generation and experimental design, they fall short of replicating the collaborative nature of real-world scientific practice, where diverse experts work together in teams to tackle complex problems. To address these limitations, we propose an LLM-based multi-agent system, Virtual Scientists (VirSci), designed to mimic the teamwork inherent in scientific research. VirSci organizes a team of agents to collaboratively generate, evaluate, and refine research ideas. Through comprehensive experiments, we demonstrate that this multi-agent approach outperforms the state-of-the-art method in producing novel scientific ideas. We further investigate the collaboration mechanisms that contribute to its tendency to produce ideas with higher novelty, offering valuable insights to guide future research and illuminating pathways toward building a robust system for autonomous scientific discovery. The code is available at https://github.com/open-sciencelab/Virtual-Scientists.

Paper Structure

This paper contains 61 sections, 7 equations, 35 figures, 9 tables.

Figures (35)

  • Figure 1: The proposed LLM-based multi-agent system, VirSci, includes five key steps: Collaborator Selection, where a research team is assembled; Topic Discussion, where the research topic is determined; Idea Generation, where team members propose and refine ideas; Novelty Assessment, where ideas are evaluated and voted on to select the best one; and Abstract Generation, where the selected idea is developed into a complete abstract.
  • Figure 2: Key components of the proposed system. The left section illustrates the collaborator selection process, where the team leader forms a research team. The middle section highlights the discussion routine, a fundamental part of every step in the system, where the team engages in collaborative dialogue to progress through tasks. The right section depicts the architecture of the author knowledge bank and paper database, which provide critical information used throughout the collaboration process.
  • Figure 3: Evaluation of abstracts using our overall novelty metric and human evaluation. The Pearson correlation coefficient of 0.52 indicates a positive correlation.
  • Figure 4: Effects of team size and number of discussion turns on novelty. Novelty peaks with 8 members and 5 turns, while larger teams or excessive turns hinder creativity. "Inference Cost" is the product of team size and turns.
  • Figure 5: The balance of new and returning collaborators in the team has a notable impact on novelty, with 50% freshness yielding the highest historical dissimilarity and overall novelty, particularly in larger teams.
  • ...and 30 more figures
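
Three quantities in the captions above lend themselves to a short worked example: the Pearson correlation (Figure 3), the inference cost as team size times turns (Figure 4), and team freshness (Figure 5). The sketch below is illustrative only. The function names are hypothetical, the toy inputs are made up for the example and are not the paper's data, and the freshness definition (the share of team members who are new collaborators) is an assumption read off the Figure 5 caption rather than the paper's formal definition.

```python
from math import sqrt
from typing import Sequence, Set


def inference_cost(team_size: int, turns: int) -> int:
    """Figure 4 caption: inference cost is the product of team size and turns."""
    return team_size * turns


def freshness(team: Set[str], past_collaborators: Set[str]) -> float:
    """Assumed definition: fraction of team members who are new collaborators."""
    if not team:
        raise ValueError("team must not be empty")
    return len(team - past_collaborators) / len(team)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Standard Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Toy inputs, chosen only to exercise the functions.
cost = inference_cost(team_size=8, turns=5)
fresh = freshness({"a", "b", "c", "d"}, past_collaborators={"a", "b"})
r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(cost, fresh, round(r, 6))
```

With the peak configuration from Figure 4 (8 members, 5 turns) the cost is 40, and a four-person team with two returning members has freshness 0.5 under the assumed definition.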