Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface

Wenyue Hua; Mengting Wan; Shashank Vadrevu; Ryan Nadel; Yongfeng Zhang; Chi Wang


TL;DR

Interactive Speculative Planning (ISP) addresses latency in LLM-based agents by co-designing a system that leverages speculative planning with an approximation agent $\mathcal{A}$, a target agent $\mathcal{T}$, and active human involvement. The method runs $\mathcal{A}$ and asynchronous $\mathcal{T}$ in parallel, using user interventions to correct or accelerate steps, with a UI-rescheduling mechanism ensuring sequential, comprehensible results. The paper provides theoretical latency and token analyses, supported by simulation, and demonstrates empirical gains on OpenAGI and TravelPlanner benchmarks across four settings, showing substantial reductions in total and stepwise latency and favorable cost dynamics. The work highlights practical benefits for user experience and automation efficiency, while noting limitations in matching strategies, security concerns, and UI features, with clear directions for future improvement. Overall, ISP offers a training-free, adaptable framework for dynamic, user-aware agent planning that can accelerate complex, multi-step tasks in real-world applications.
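To make the parallel draft-and-verify idea concrete, here is a minimal Python sketch of the speculative part only. It is an illustration under stated assumptions, not the paper's algorithm: the function `speculative_plan`, the `Agent` type, the toy agents, and the exact-match check are all hypothetical, and user intervention and UI rescheduling are left out. The fast approximation agent drafts steps one after another, while the slower target agent checks each drafted prefix in a worker thread. Drafted steps are kept up to the first mismatch, where the target's step replaces the draft and speculation restarts.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Step = str
# Hypothetical agent interface: given the steps so far, return the next step.
Agent = Callable[[List[Step]], Step]


def speculative_plan(approx: Agent, target: Agent, n_steps: int) -> List[Step]:
    """Build an n_steps plan that equals what the target agent alone would produce."""
    plan: List[Step] = []
    with ThreadPoolExecutor() as pool:
        while len(plan) < n_steps:
            prefix = list(plan)
            guesses, checks = [], []
            for _ in range(n_steps - len(plan)):
                # Launch the target on the current prefix, then keep drafting
                # with the approximation agent without waiting for the result.
                checks.append(pool.submit(target, list(prefix)))
                guess = approx(prefix)
                guesses.append(guess)
                prefix.append(guess)
            # Verify drafts in order; the target's answer is always the one kept.
            for guess, check in zip(guesses, checks):
                verified = check.result()
                plan.append(verified)
                if verified != guess:
                    # Later drafts were built on a wrong prefix, so drop them.
                    # (Simplification: their target calls are not cancelled.)
                    break
    return plan


# Toy agents (assumptions for the demo): the approximation errs on one step.
TRUE_PLAN = ["search", "filter", "book", "confirm"]


def target(prefix: List[Step]) -> Step:
    return TRUE_PLAN[len(prefix)]


def approx(prefix: List[Step]) -> Step:
    step = TRUE_PLAN[len(prefix)]
    return "skip" if step == "book" else step


result = speculative_plan(approx, target, len(TRUE_PLAN))
```

In this toy run, `result` equals `TRUE_PLAN`: the wrong draft `"skip"` is replaced by the target's `"book"`, and the remaining step is re-speculated from the corrected prefix. The sketch assumes a deterministic target; a real system would also need a looser matching rule than string equality.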

Abstract

Agents, as user-centric tools, are increasingly deployed for human task delegation, assisting with a broad spectrum of requests by generating thoughts, engaging with user proxies, and producing action plans. However, agents based on large language models (LLMs) often face substantial planning latency for two primary reasons: the underlying LLMs are slow because of their large size and high demand, and the agents are structurally complex because they generate extensive intermediate thoughts before producing the final output. Given that inefficiency in service provision can undermine the value of automation for users, this paper presents a human-centered efficient agent planning method -- Interactive Speculative Planning -- that aims to enhance the efficiency of agent planning through both system design and human-AI interaction. Our approach advocates for the co-design of the agent system and user interface, underscoring the importance of an agent system that can fluidly manage user interactions and interruptions. By integrating human interruptions as a fundamental component of the system, we not only make it more user-centric but also expedite the entire process by leveraging human-in-the-loop interactions to provide accurate intermediate steps. Code and data will be released.


Paper Structure (32 sections, 10 equations, 22 figures, 3 tables, 2 algorithms)


Introduction
Related Work
Interactive Speculative Planning
Speculative Planning
UI Interaction Algorithm
Efficiency Analysis
Latency Analysis
Total token required
Rate Required
Simulation Experiment for Speculative Planning
Experiment
Benchmarks
Speculative Planning Settings
Setting 1
Setting 2
...and 17 more sections

Figures (22)

Figure 1: Interactive Speculative Planning: the user query is handled by speculative planning with an approximation agent and a target agent. A rescheduling mechanism then serializes the computed results on the UI and lets the user actively interact with the system for further acceleration. The action marked with a pointing finger is the one the user intervened on.
Figure 2: Speculative Planning Algorithm Demonstration, where the cross symbol indicates the step at which $\mathcal{A}$'s computed result differs from that of $\mathcal{T}$.
Figure 3: User interface which guarantees a sequential presentation of approximation agent's output and target agent's output with minimum perceived latency.
Figure 4: Comparing the time taken to generate the first two steps of a task by agent system using speculative planning and normal agent planning.
Figure 5: UI interface issues stemming from immediate presentation of computed action steps.
...and 17 more figures
