RoboPilot: Generalizable Dynamic Robotic Manipulation with Dual-thinking Modes

Xinyi Liu, Mohammadreza Fani Sani, Zewei Zhou, Julius Wirbel, Bahram Zarrin, Roberto Galeazzi

TL;DR

RoboPilot addresses the challenge of robust, dynamic robotic manipulation by introducing a dual-thinking closed-loop framework that combines fast action-primitive planning with feedback-driven replanning and Chain-of-Thought reasoning. A ModeSelector dynamically switches between fast-thinking and slow-thinking modes to balance efficiency and accuracy, enabling reliable handling of complex, long-horizon tasks. The authors also present RoboPilot-Bench, a two-part benchmark (Canonical Manipulation Suite and Robustness Evaluation Suite) that evaluates performance and robustness under dynamic conditions, including infeasible tasks and error recovery. In simulation and real-world experiments, RoboPilot achieves a 25.9% improvement in task success rate over state-of-the-art baselines and remains robust in dynamic settings, supporting the practical value of adaptive dual-thinking for industrial and service robotics.

Abstract

Despite rapid progress in autonomous robotics, executing complex or long-horizon tasks remains a fundamental challenge. Most current approaches follow an open-loop paradigm with limited reasoning and no feedback, resulting in poor robustness to environmental changes and severe error accumulation. We present RoboPilot, a dual-thinking closed-loop framework for robotic manipulation that supports adaptive reasoning for complex tasks in real-world dynamic environments. RoboPilot leverages primitive actions for structured task planning and flexible action generation, while introducing feedback to enable replanning in response to dynamic changes and execution errors. Chain-of-Thought reasoning further enhances high-level task planning and guides low-level action generation. The system dynamically switches between fast and slow thinking to balance efficiency and accuracy. To systematically evaluate the robustness of RoboPilot in diverse robot manipulation scenarios, we introduce RoboPilot-Bench, a benchmark spanning 21 tasks across 10 categories, including infeasible-task recognition and failure recovery. Experiments show that RoboPilot outperforms state-of-the-art baselines by 25.9% in task success rate, and deployment on an industrial robot further demonstrates its robustness in real-world settings.
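
The closed-loop, dual-mode control flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Feedback`, `select_mode`, and `run_closed_loop` are hypothetical, the planners and the executor are passed in as plain callables, and the word-count rule inside `select_mode` is only a stand-in for the ModeSelector, whose actual decision criteria are not specified in this summary.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Feedback:
    """Outcome of executing one primitive action (hypothetical structure)."""
    success: bool
    message: str = ""


Planner = Callable[[str, Optional[Feedback]], List[str]]
Executor = Callable[[str], Feedback]


def select_mode(task: str, last_feedback: Optional[Feedback]) -> str:
    """Stand-in for the ModeSelector (assumption, not the paper's rule).

    Chooses slow thinking after a failed execution or for long instructions,
    and fast thinking otherwise.
    """
    if last_feedback is not None and not last_feedback.success:
        return "slow"
    return "slow" if len(task.split()) > 12 else "fast"


def run_closed_loop(
    task: str,
    plan_fast: Planner,
    plan_slow: Planner,
    execute: Executor,
    max_replans: int = 3,
) -> Tuple[bool, List[str]]:
    """Plan, execute, and replan on failure; return (success, modes used)."""
    feedback: Optional[Feedback] = None
    modes: List[str] = []
    for _ in range(max_replans + 1):
        mode = select_mode(task, feedback)
        modes.append(mode)
        planner = plan_slow if mode == "slow" else plan_fast
        # The planner sees the last feedback so it can replan around the error.
        plan = planner(task, feedback)
        # An empty plan counts as trivially complete in this sketch.
        feedback = Feedback(True)
        for action in plan:
            feedback = execute(action)
            if not feedback.success:
                break  # stop executing and hand the failure back to the loop
        if feedback.success:
            return True, modes
    return False, modes
```

The point of the sketch is the structure rather than the heuristic: execution feedback flows back into both mode selection and planning, so a failed action triggers a replanning pass instead of letting errors accumulate as in an open-loop pipeline.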

Paper Structure

This paper contains 21 sections, 1 equation, 5 figures, 4 tables.

Figures (5)

  • Figure 2: Prompt Snapshot for RoboPilot. The prompt includes three parts, one each for the ModeSelector, CoT Reasoning, and Action Generation modules. The prompt used in the fast-thinking mode is identical to that of the slow-thinking mode, except that it omits the reasoning prompt.
  • Figure 3: Task definition and difficulty in RoboPilot-Bench, including the Canonical Manipulation Suite (5 Groups with 13 Tasks) and Robustness Evaluation Suite (5 Groups with 8 Tasks). A three-pair example is provided with red arrows highlighting one possible solution. The difficulty score between 1 (easy) and 5 (hard) is shown in the top left corner.
  • Figure 4: Qualitative Experimental Results: (a) Simple sequential reasoning task (Fast-Thinking Mode). (b) A spatial reasoning task (Slow-Thinking Mode). (c) An error recovery long-horizon task (Slow-Thinking Mode).
  • Figure 5: Real-world Experiment Results on UR3E Robot. RoboPilot demonstrates its performance and robustness across diverse manipulation tasks, particularly error-recovery tasks.
  • Figure 6: ModeSelector Performance Analysis