Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface
Wenyue Hua, Mengting Wan, Shashank Vadrevu, Ryan Nadel, Yongfeng Zhang, Chi Wang
TL;DR
Interactive Speculative Planning (ISP) addresses latency in LLM-based agents by co-designing a system that leverages speculative planning with an approximation agent $\text{A}$, a target agent $\text{T}$, and active human involvement. The method runs $\text{A}$ and asynchronous $\text{T}$ in parallel, using user interventions to correct or accelerate steps, with a UI-rescheduling mechanism ensuring sequential, comprehensible results. The paper provides theoretical latency and token analyses, supported by simulation, and demonstrates empirical gains on OpenAGI and TravelPlanner benchmarks across four settings, showing substantial reductions in total and stepwise latency and favorable cost dynamics. The work highlights practical benefits for user experience and automation efficiency, while noting limitations in matching strategies, security concerns, and UI features, with clear directions for future improvement. Overall, ISP offers a training-free, adaptable framework for dynamic, user-aware agent planning that can accelerate complex, multi-step tasks in real-world applications.
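The interplay between the approximation agent and the target agent can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function `speculative_plan`, the lookahead parameter `k`, the round-based scheduling, and the exact-match check are all assumptions made for illustration, and user interventions and the UI-rescheduling mechanism are not modeled. Each agent is treated as a deterministic function from the plan prefix to the next step; the verification calls inside one round are the ones a real system could issue concurrently, so the number of rounds serves as a rough proxy for latency.

```python
from typing import Callable, List, Tuple

# Hypothetical agent interface: maps the plan prefix to the next step.
Agent = Callable[[List[str]], str]


def speculative_plan(
    approx: Agent, target: Agent, n_steps: int, k: int = 3
) -> Tuple[List[str], int]:
    """Return the final plan and the number of verification rounds used."""
    plan: List[str] = []
    rounds = 0
    while len(plan) < n_steps:
        rounds += 1
        base = list(plan)  # prefix already confirmed by the target agent

        # The fast agent speculates up to k steps ahead of the confirmed prefix.
        guesses: List[str] = []
        for _ in range(min(k, n_steps - len(base))):
            guesses.append(approx(base + guesses))

        # The target agent checks each guess against the prefix that precedes
        # it. These calls are independent, so they could run in parallel.
        for i, guess in enumerate(guesses):
            truth = target(base + guesses[:i])
            plan.append(truth)  # the target's step is always the one kept
            if truth != guess:
                break  # mismatch: discard the remaining speculative steps
    return plan, rounds


# Toy agents: the approximation agent is wrong only at the third step.
def target_agent(prefix: List[str]) -> str:
    return f"step{len(prefix)}"


def approx_agent(prefix: List[str]) -> str:
    return "wrong" if len(prefix) == 2 else f"step{len(prefix)}"


plan, rounds = speculative_plan(approx_agent, target_agent, n_steps=5, k=3)
```

In this toy run the final plan matches what the target agent alone would produce, while the five steps are confirmed in two rounds instead of five sequential target calls. The saving shrinks as the approximation agent's guesses diverge from the target's more often.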
Abstract
Agents, as user-centric tools, are increasingly deployed for human task delegation, assisting with a broad spectrum of requests by generating thoughts, engaging with user proxies, and producing action plans. However, agents based on large language models (LLMs) often face substantial planning latency for two primary reasons: the limited efficiency of the underlying LLMs, which stems from their large size and high demand, and the structural complexity of the agents, which generate extensive intermediate thoughts before producing the final output. Given that inefficiency in service provision can undermine the value of automation for users, this paper presents a human-centered efficient agent planning method, Interactive Speculative Planning, which aims to enhance the efficiency of agent planning through both system design and human-AI interaction. Our approach advocates for the co-design of the agent system and user interface, underscoring the importance of an agent system that can fluidly manage user interactions and interruptions. By integrating human interruptions as a fundamental component of the system, we not only make it more user-centric but also expedite the entire process by leveraging human-in-the-loop interactions to provide accurate intermediate steps. Code and data will be released.
