LLM$\times$MapReduce-V3: Enabling Interactive In-Depth Survey Generation through a MCP-Driven Hierarchically Modular Agent System

Yu Chao, Siyu Lin, Xiaorong Wang, Zhu Zhang, Zihan Zhou, Haoyu Wang, Shuo Wang, Jie Zhou, Zhiyuan Liu, Maosong Sun

TL;DR

LLM×MapReduce-V3 presents a hierarchically modular, MCP-based agent architecture for interactive long-form survey generation. It decomposes core tasks into specialized MCP servers (e.g., Skeleton Initialization, Digest Construction, Skeleton Refinement) and uses a high-level planner $\pi: \mathcal{H} \times \mathcal{C} \rightarrow \mathcal{T}^*$ to dynamically orchestrate module invocation, enabling non-linear, multi-turn workflows. A human-in-the-loop framework guides topic consensus and outline refinement, ensuring alignment with user expertise. Human expert evaluations show improved skeleton quality and longer, more informative surveys than representative baselines, underscoring the value of modular MCP planning and open, replaceable components.

Abstract

We introduce LLM×MapReduce-V3, a hierarchically modular agent system designed for long-form survey generation. Building on its predecessor, LLM×MapReduce-V2, this version adopts a multi-agent architecture in which individual functional components, such as skeleton initialization, digest construction, and skeleton refinement, are implemented as independent Model Context Protocol (MCP) servers. These atomic servers can be aggregated into higher-level servers, creating a hierarchically structured system. A high-level planner agent dynamically orchestrates the workflow by selecting appropriate modules based on their MCP tool descriptions and the execution history. This modular decomposition facilitates human-in-the-loop intervention, giving users greater control over and customization of the research process. Through multi-turn interaction, the system captures the intended research perspectives and generates a comprehensive skeleton, which is then developed into an in-depth survey. Human evaluations demonstrate that our system surpasses representative baselines in both content depth and length, highlighting the strength of MCP-based modular planning.
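The planner-driven orchestration described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Tool` class, the three toy module functions, and the rule-based `plan_next` are all hypothetical stand-ins. A real system would expose each module as an MCP server and would have an LLM choose the next module from the tool descriptions and execution history; here the planner simply picks the first tool that has not run yet, and it ignores the descriptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tool:
    """Hypothetical stand-in for an MCP tool: name, description, callable."""
    name: str
    description: str
    run: Callable[[dict], dict]


# Toy module bodies (assumptions for illustration only).
def init_skeleton(state: dict) -> dict:
    return {**state, "skeleton": [f"Introduction to {state['topic']}"]}


def build_digests(state: dict) -> dict:
    return {**state, "digests": [f"digest for: {s}" for s in state["skeleton"]]}


def refine_skeleton(state: dict) -> dict:
    return {**state, "skeleton": state["skeleton"] + ["Open Problems"], "refined": True}


TOOLS = [
    Tool("skeleton_initialization", "Draft an initial outline for the topic.", init_skeleton),
    Tool("digest_construction", "Summarize references against the outline.", build_digests),
    Tool("skeleton_refinement", "Revise the outline using the digests.", refine_skeleton),
]


def plan_next(history: list[str], tools: list[Tool]) -> Optional[Tool]:
    """Toy planner: return the first tool not yet in the history.

    A real planner would prompt an LLM with the tool descriptions and the
    execution history; this fixed rule only shows the control flow.
    """
    for tool in tools:
        if tool.name not in history:
            return tool
    return None  # nothing left to invoke


def run(topic: str) -> tuple[dict, list[str]]:
    """Invoke planner-selected tools until the planner returns None."""
    state: dict = {"topic": topic}
    history: list[str] = []
    while (tool := plan_next(history, TOOLS)) is not None:
        state = tool.run(state)
        history.append(tool.name)
    return state, history


if __name__ == "__main__":
    final_state, trace = run("graph neural networks")
    print(trace)
    print(final_state["skeleton"])
```

Because the planner is consulted before every step, swapping the fixed rule for a model-backed policy, or pausing for user feedback between steps, would not require changing the individual modules.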

Paper Structure

This paper contains 29 sections, 6 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 1: Our agent-server ecosystem pipeline. Users begin by specifying a topic, optionally adding detailed descriptions or uploading documents. The Analysis Agent interprets user intent and coordinates with the Search Agent to retrieve and organize relevant literature. The Skeleton Agent then generates and refines an outline, which is used by the Writing Agent to complete the paper.