Deep Research: A Survey of Autonomous Research Agents

Wenlin Zhang, Xiaopeng Li, Yingyi Zhang, Pengyue Jia, Yichao Wang, Huifeng Guo, Yong Liu, Xiangyu Zhao

TL;DR

The paper defines the deep research paradigm, a four-stage autonomous pipeline (planning, question developing, web exploration, and report generation) that augments LLMs with goal-directed, evidence-grounded analysis. It surveys modular methods for each stage: planning (structured knowledge and learnable planning), question developing (reward-based and supervision-based), web exploration (browser-based, multimodal, and API-driven retrieval), and report generation (structure-aware generation and factual grounding). It also reviews optimization strategies and benchmarks, highlights current limitations in factuality, multimodal reasoning, and cross-task transfer, and outlines directions such as multi-tool integration and personalized, scalable training. Together, these insights chart a roadmap toward more capable, interpretable, and trustworthy deep research agents that produce coherent, evidence-backed analytical reports.

Abstract

The rapid advancement of large language models (LLMs) has driven the development of agentic systems capable of autonomously performing complex tasks. Despite their impressive capabilities, LLMs remain constrained by their internal knowledge boundaries. To overcome these limitations, the paradigm of deep research has been proposed, wherein agents actively engage in planning, retrieval, and synthesis to generate comprehensive and faithful analytical reports grounded in web-based evidence. In this survey, we provide a systematic overview of the deep research pipeline, which comprises four core stages: planning, question developing, web exploration, and report generation. For each stage, we analyze the key technical challenges and categorize representative methods developed to address them. Furthermore, we summarize recent advances in optimization techniques and benchmarks tailored for deep research. Finally, we discuss open challenges and promising research directions, aiming to chart a roadmap toward building more capable and trustworthy deep research agents.

Paper Structure

This paper contains 26 sections, 4 equations, 1 figure, and 5 tables.

Figures (1)

  • Figure 1: Overview of the deep research system.

Theorems & Definitions (4)

  • Definition 1: Planning
  • Definition 2: Question Developing
  • Definition 3: Web Exploration
  • Definition 4: Report Generation
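
The four stages named in these definitions can be read as a sequential pipeline over a shared state. The following is a minimal illustrative sketch of that control flow, not the survey's method: `ResearchState`, the stage functions, and the injected `search` callable are all hypothetical names, and each stage body is a placeholder where a real agent would presumably call an LLM or a retrieval tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ResearchState:
    """Hypothetical container passed between the four stages."""
    query: str
    plan: List[str] = field(default_factory=list)
    questions: List[str] = field(default_factory=list)
    evidence: Dict[str, List[str]] = field(default_factory=dict)
    report: str = ""


def plan(state: ResearchState) -> ResearchState:
    # Stage 1 (planning): placeholder decomposition of the query into sub-goals.
    state.plan = [
        f"background of {state.query}",
        f"open problems in {state.query}",
    ]
    return state


def develop_questions(state: ResearchState) -> ResearchState:
    # Stage 2 (question developing): one placeholder question per sub-goal.
    state.questions = [f"What is known about the {step}?" for step in state.plan]
    return state


def explore_web(
    state: ResearchState, search: Callable[[str], List[str]]
) -> ResearchState:
    # Stage 3 (web exploration): `search` is an injected, assumed retrieval
    # function that maps a question to a list of text snippets.
    for question in state.questions:
        state.evidence[question] = search(question)
    return state


def generate_report(state: ResearchState) -> ResearchState:
    # Stage 4 (report generation): list each question with numbered evidence.
    lines = [f"Report: {state.query}"]
    for question, snippets in state.evidence.items():
        lines.append(f"- {question}")
        lines.extend(f"    [{i}] {s}" for i, s in enumerate(snippets, 1))
    state.report = "\n".join(lines)
    return state


def run_pipeline(query: str, search: Callable[[str], List[str]]) -> ResearchState:
    # Run the four stages in order over a single shared state object.
    state = ResearchState(query=query)
    state = plan(state)
    state = develop_questions(state)
    state = explore_web(state, search)
    return generate_report(state)
```

Passing the retrieval function in as an argument keeps the sketch runnable offline with a stub, for example `run_pipeline("deep research agents", lambda q: [f"snippet for: {q}"])`.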