InfoSeeker: A Scalable Hierarchical Parallel Agent Framework for Web Information Seeking

Ka Yiu Lee, Yuxuan Huang, Zhiyuan He, Huichi Zhou, Weilin Luo, Kun Shao, Meng Fang, Jun Wang

Abstract

Recent agentic search systems have made substantial progress by emphasising deep, multi-step reasoning. However, this focus often overlooks the challenges of wide-scale information synthesis, where agents must aggregate large volumes of heterogeneous evidence across many sources. As a result, most existing large language model agent systems face severe limitations in data-intensive settings, including context saturation, cascading error propagation, and high end-to-end latency. To address these challenges, we present InfoSeeker, a hierarchical framework based on the principle of near-decomposability, comprising a strategic Host, multiple Managers, and parallel Workers. By leveraging aggregation and reflection mechanisms at the Manager layer, our framework enforces strict context isolation to prevent saturation and error propagation. At the same time, parallelism in the Worker layer speeds up overall task execution and reduces end-to-end latency. Our evaluation on two complementary benchmarks demonstrates both efficiency ($3$ to $5\times$ speed-up) and effectiveness, achieving an $8.4\%$ success rate on WideSearch-en and $52.9\%$ accuracy on BrowseComp-zh. The code is released at https://github.com/agent-on-the-fly/InfoSeeker

Paper Structure

This paper contains 25 sections, 10 equations, 15 figures, 7 tables, 1 algorithm.

Figures (15)

  • Figure 1: Performance results on BrowseComp-zh (avg) and WideSearch (avg/max).
  • Figure 2: Overview of the InfoSeeker framework. The system features a three-tier topology consisting of a strategic Host, domain-specific Managers, and tool-executing Workers. By enforcing hierarchical context isolation, high-level directives ($q_t$) are decomposed into parallelisable subtasks ($q_t^k$) by Managers and executed by Workers. Final results are aggregated into concise summaries ($y_t$) to support long-horizon planning while preventing context exhaustion at the strategic level.
  • Figure 3: Time efficiency comparison. InfoSeeker reduces inference time by more than $2\times$, enabled by its parallel execution design.
  • Figure 4: The Impact of Worker Pool Size. End-to-end inference time vs. worker-pool size. Larger pools reduce latency by enabling concurrent execution of weakly coupled subtasks.
  • Figure 5: Execution trace: Michelin three-star restaurant synthesis.
  • ...and 10 more figures
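
The three-tier topology described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the released InfoSeeker code: the function names `host`, `manager`, and `worker`, the placeholder subtask splitting, the simulated latency, and the summary format are all assumptions made for illustration. It shows only the control flow: the Host sees summaries ($y_t$) rather than raw evidence, and each Manager fans its subtasks ($q_t^k$) out to a bounded pool of concurrent Workers.

```python
import asyncio


async def worker(subtask: str) -> str:
    """Hypothetical Worker: stands in for one tool call (e.g. a web search)."""
    await asyncio.sleep(0.01)  # simulated I/O latency, not a real request
    return f"evidence for {subtask}"


async def manager(directive: str, n_subtasks: int, pool_size: int) -> str:
    """Hypothetical Manager: split a directive q_t into subtasks q_t^k,
    run them on a bounded worker pool, and return a short summary y_t."""
    # Assumption: a real system would decompose with an LLM; here we just index.
    subtasks = [f"{directive}#{k}" for k in range(n_subtasks)]
    pool = asyncio.Semaphore(pool_size)  # caps the number of concurrent Workers

    async def run(subtask: str) -> str:
        async with pool:
            return await worker(subtask)

    results = await asyncio.gather(*(run(s) for s in subtasks))
    # Context isolation: raw results stay inside the Manager; only the
    # summary string is returned to the Host.
    return f"{directive}: {len(results)} findings"


async def host(directives: list[str], n_subtasks: int = 4, pool_size: int = 4) -> list[str]:
    """Hypothetical Host: issue directives one at a time and keep only summaries."""
    context: list[str] = []
    for q_t in directives:
        y_t = await manager(q_t, n_subtasks, pool_size)
        context.append(y_t)
    return context


summaries = asyncio.run(host(["restaurants in Paris", "restaurants in Tokyo"]))
print(summaries)
```

In this sketch, lowering `pool_size` forces subtasks to queue behind the semaphore, which is the kind of effect the worker-pool-size experiment in Figure 4 measures; the actual scheduling policy used by the framework is not specified here.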