WebRollback: Enhancing Web Agents with Explicit Rollback Mechanisms

Zhisong Zhang, Tianqing Fang, Kaixin Ma, Wenhao Yu, Hongming Zhang, Haitao Mi, Dong Yu

TL;DR

WebRollback addresses the brittleness of web agents in dynamic environments by adding an explicit rollback mechanism to the search process. The approach introduces a critique module and a rollback module that allow multi-step reversions, balancing exploration and efficiency. Experiments on Mind2Web-Live and WebVoyager across zero-shot and fine-tuning regimes show that rollback improves task success and reduces unnecessary state switching, often outperforming OneWay and BestFirst. The work demonstrates a practical path to more robust, budget-conscious web navigation with model-driven control over the search trajectory.

Abstract

With recent advancements in large language models, web agents have been greatly improved. However, dealing with complex and dynamic web environments requires more advanced planning and search abilities. Previous studies usually adopt a greedy one-way search strategy, which may struggle to recover from erroneous states. In this work, we enhance web agents with an explicit rollback mechanism, enabling the agent to revert to a previous state in its navigation trajectory. This mechanism gives the model the flexibility to directly control the search process, leading to an effective and efficient web navigation method. We conduct experiments on two live web navigation benchmarks with zero-shot and fine-tuning settings. The results demonstrate the effectiveness of our proposed approach.
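
The rollback idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `Act`, `Rollback`, `Trajectory`, and `toy_transition` are hypothetical, states are plain strings rather than live web pages, and the choice to discard the states after the rollback target is an assumption made for simplicity.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Act:
    """An ordinary navigation action taken from the current state (hypothetical)."""
    name: str


@dataclass
class Rollback:
    """A request to return to the state stored at position `index` (hypothetical)."""
    index: int


Decision = Union[Act, Rollback]


@dataclass
class Trajectory:
    """The list of states visited so far, oldest first."""
    states: List[str] = field(default_factory=list)

    def step(self, decision: Decision, transition: Callable[[str, str], str]) -> str:
        if isinstance(decision, Rollback):
            if not 0 <= decision.index < len(self.states):
                raise IndexError("rollback target is not in the trajectory")
            # Multi-step reversion: drop every state after the target.
            # (Assumption: abandoned states are discarded, not kept for later.)
            del self.states[decision.index + 1:]
        else:
            # One-way step: extend the trajectory with the next state.
            self.states.append(transition(self.states[-1], decision.name))
        return self.states[-1]


def toy_transition(state: str, action: str) -> str:
    """Stand-in for a browser: the new state is the old one plus the action."""
    return f"{state}/{action}"


traj = Trajectory(states=["home"])
traj.step(Act("search"), toy_transition)
traj.step(Act("wrong_link"), toy_transition)
current = traj.step(Rollback(0), toy_transition)  # jump back two steps at once
```

In this sketch the decision to roll back, and the target index, would come from the model itself rather than from a state-value ranking, which is the distinction the abstract draws against one-way search.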

Paper Structure

This paper contains 17 sections, 3 figures, 4 tables, 1 algorithm.

Figures (3)

  • Figure 1: An overview of different search strategies. The OneWay strategy may get stuck in erroneous states, the BestFirst strategy performs rollback based on state values, while our proposed strategy lets the model directly decide when and where to roll back.
  • Figure 2: Task-finishing rate analysis. Here, the $x$-axis denotes the number of steps the agent takes, and the $y$-axis denotes the percentage of tasks that can be finished within a specific step limit.
  • Figure 3: Results with different maximum step budgets. The light bars indicate the Partial% scores, while the darker and shaded parts represent Full% scores.
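
The quantity plotted in Figure 2 can be computed as in the sketch below, assuming each task records the number of steps at which it finished, or `None` if it never finished. The function name `finishing_rate` and the sample step counts are hypothetical and are not taken from the paper's results.

```python
from typing import List, Optional


def finishing_rate(steps_to_finish: List[Optional[int]], step_limit: int) -> float:
    """Percentage of tasks finished within `step_limit` steps.

    `None` marks a task that never finished (an assumption of this sketch).
    """
    if not steps_to_finish:
        raise ValueError("need at least one task")
    done = sum(1 for s in steps_to_finish if s is not None and s <= step_limit)
    return 100.0 * done / len(steps_to_finish)


# Illustrative step counts only, not data from the paper.
steps = [3, 7, None, 12, 5]
curve = [finishing_rate(steps, limit) for limit in (5, 10, 15)]
```

Evaluating the function over increasing step limits yields a non-decreasing curve of the kind the figure caption describes.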