Table of Contents
Fetching ...

AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agents

Zhaopeng Feng, Liangcai Su, Zhen Zhang, Xinyu Wang, Xiaotian Zhang, Xiaobin Wang, Runnan Fang, Qi Zhang, Baixuan Li, Shihao Cai, Rui Ye, Hui Chen, Jiang Yong, Joey Tianyi Zhou, Chenxiong Qian, Pengjun Xie, Bryan Hooi, Zuozhu Liu, Jingren Zhou

Abstract

As large language models (LLMs) evolve into autonomous agents for long-horizon information-seeking, managing finite context capacity has become a critical bottleneck. Existing context management methods typically commit to a single fixed strategy throughout the entire trajectory. Such static designs may work well in some states, but they cannot adapt as the usefulness and reliability of the accumulated context evolve during long-horizon search. To formalize this challenge, we introduce a probabilistic framework that characterizes long-horizon success through two complementary dimensions: search efficiency and terminal precision. Building on this perspective, we propose AgentSwing, a state-aware adaptive parallel context management routing framework. At each trigger point, AgentSwing expands multiple context-managed branches in parallel and uses lookahead routing to select the most promising continuation. Experiments across diverse benchmarks and agent backbones show that AgentSwing consistently outperforms strong static context management methods, often matching or exceeding their performance with up to $3\times$ fewer interaction turns while also improving the ultimate performance ceiling of long-horizon web agents. Beyond the empirical gains, the proposed probabilistic framework provides a principled lens for analyzing and designing future context management strategies for long-horizon agents.

AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agents

Abstract

As large language models (LLMs) evolve into autonomous agents for long-horizon information-seeking, managing finite context capacity has become a critical bottleneck. Existing context management methods typically commit to a single fixed strategy throughout the entire trajectory. Such static designs may work well in some states, but they cannot adapt as the usefulness and reliability of the accumulated context evolve during long-horizon search. To formalize this challenge, we introduce a probabilistic framework that characterizes long-horizon success through two complementary dimensions: search efficiency and terminal precision. Building on this perspective, we propose AgentSwing, a state-aware adaptive parallel context management routing framework. At each trigger point, AgentSwing expands multiple context-managed branches in parallel and uses lookahead routing to select the most promising continuation. Experiments across diverse benchmarks and agent backbones show that AgentSwing consistently outperforms strong static context management methods, often matching or exceeding their performance with up to fewer interaction turns while also improving the ultimate performance ceiling of long-horizon web agents. Beyond the empirical gains, the proposed probabilistic framework provides a principled lens for analyzing and designing future context management strategies for long-horizon agents.

Paper Structure

This paper contains 17 sections, 10 equations, 9 figures, 9 tables.

Figures (9)

  • Figure 1: Performance on BrowseComp under different interaction budgets. Dashed lines denote the baselines without context management.
  • Figure 2: Performance on BrowseComp under Discard-All with different context budgets.
  • Figure 3: Comparison of context management strategies through the efficiency-precision lens. (a) Pass@1 on BrowseComp. (b) Search efficiency $\eta$ vs. terminal precision $\rho$. (c) Aligned terminal precision, where Align-Finish refers to the common finished cases shared by different strategies within the same model.
  • Figure 4: Overview of AgentSwing. AgentSwing triggers context management once the accumulated context exceeds a predefined threshold, executes multiple candidate strategies in parallel, extends each branch for $K$ new turns, and dynamically routes to the most promising continuation.
  • Figure 5: Performance of different context management strategies on BrowseComp over maximum interaction turns.
  • ...and 4 more figures