HiRA: A Hierarchical Reasoning Framework for Decoupled Planning and Execution in Deep Search

Jiajie Jin, Xiaoxi Li, Guanting Dong, Yuyao Zhang, Yutao Zhu, Yang Zhao, Hongjin Qian, Zhicheng Dou

TL;DR

HiRA tackles deep search tasks by decoupling strategic planning from execution through a three-layer, hierarchical framework: a meta reasoning planner, an adaptive reasoning coordinator, and domain-specific executors. This design enables dynamic task decomposition, efficient delegation to specialized agents, and distilled reasoning feedback to maintain coherent multi-step reasoning. Empirical results on four complex, cross-modal benchmarks show HiRA significantly outperforms traditional RAG and existing agent-based methods, with notable gains in both answer quality and system efficiency. The approach advances scalable deep search by integrating modular reasoning with external tools and memory, enabling robust real-world information synthesis.

Abstract

Complex information needs in real-world search scenarios demand deep reasoning and knowledge synthesis across diverse sources, which traditional retrieval-augmented generation (RAG) pipelines struggle to address effectively. Current reasoning-based approaches suffer from a fundamental limitation: they use a single model to handle both high-level planning and detailed execution, leading to inefficient reasoning and limited scalability. In this paper, we introduce HiRA, a hierarchical framework that separates strategic planning from specialized execution. Our approach decomposes complex search tasks into focused subtasks, assigns each subtask to domain-specific agents equipped with external tools and reasoning capabilities, and coordinates the results through a structured integration mechanism. This separation prevents execution details from disrupting high-level reasoning while enabling the system to leverage specialized expertise for different types of information processing. Experiments on four complex, cross-modal deep search benchmarks demonstrate that HiRA significantly outperforms state-of-the-art RAG and agent-based systems. Our results show improvements in both answer quality and system efficiency, highlighting the effectiveness of decoupled planning and execution for multi-step information-seeking tasks. Our code is available at https://github.com/ignorejjj/HiRA.
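
The decompose, delegate, and integrate flow described in the abstract can be illustrated with a small toy sketch. This is not the HiRA implementation: every name here (`plan`, `route`, `distill`, `run`, the two expert functions, and the keyword-based routing rule) is a hypothetical stand-in, and a real system would presumably use language models where this sketch uses string operations. It only shows the separation of roles: a planner that produces subtasks, a coordinator that picks an expert and condenses its output, and executors that do the detailed work.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubtaskResult:
    subtask: str
    expert: str
    distilled: str  # short digest handed back to the planner


# Hypothetical expert agents: each maps a subtask string to verbose raw output.
def search_expert(subtask: str) -> str:
    return f"[search notes for: {subtask}] " + "raw page text " * 5


def code_expert(subtask: str) -> str:
    return f"[computation log for: {subtask}] " + "intermediate step " * 5


EXPERTS: Dict[str, Callable[[str], str]] = {
    "search": search_expert,
    "code": code_expert,
}


def plan(question: str) -> List[str]:
    """Planner stand-in (assumption): split the question into focused subtasks."""
    return [part.strip() for part in question.split(" and ") if part.strip()]


def route(subtask: str) -> str:
    """Coordinator stand-in (assumption): pick an expert with a toy keyword rule."""
    needs_code = any(word in subtask.lower() for word in ("compute", "calculate"))
    return "code" if needs_code else "search"


def distill(raw: str, limit: int = 60) -> str:
    """Coordinator stand-in (assumption): keep only a short digest of the output."""
    return raw[:limit].rstrip()


def run(question: str) -> List[SubtaskResult]:
    """Plan, delegate each subtask, and collect distilled results."""
    results = []
    for subtask in plan(question):
        expert = route(subtask)
        raw = EXPERTS[expert](subtask)  # execution details stay out of the plan
        results.append(SubtaskResult(subtask, expert, distill(raw)))
    return results


if __name__ == "__main__":
    for r in run("find the population of Lyon and compute its share of France"):
        print(r.expert, "|", r.subtask, "|", r.distilled)
```

The point of the sketch is that `run` only ever sees the distilled digest, never the verbose expert output, which mirrors the abstract's claim that execution details are kept from disrupting high-level reasoning.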

Paper Structure

This paper contains 31 sections, 6 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Comparison of current approaches for deep search tasks: (a) Direct Reasoning with LRMs, (b) Search-Augmented Reasoning that enables LRMs to use a search engine during reasoning, and (c) Hierarchical Reasoning that autonomously interacts with expert agents and tools in a continuous thinking process.
  • Figure 2: Overview of the HiRA Framework.
  • Figure 3: Performance comparison by whether the expert agent description is provided to the meta planner and by the maximum number of sub-tasks allowed.
  • Figure 4: Comparison of our method with the baseline on three GAIA subsets, evaluating performance across different dimensions of capability.
  • Figure 5: Comparison of different methods in terms of reasoning length (number of output tokens during model inference) and interaction count (number of interactions with the environment during inference) on three datasets.