MistyPilot: An Agentic Fast-Slow Thinking LLM Framework for Misty Social Robots

Xiao Wang, Lu Dong, Jingchen Sun, Ifeoma Nwogu, Srirangaraj Setlur, Venu Govindaraju

TL;DR

MistyPilot is an agentic LLM-driven framework for autonomous tool selection, orchestration, and parameter configuration on Misty social robots. It integrates a fast-slow thinking paradigm to capture user preferences, reduce latency, and improve task efficiency.

Abstract

With the availability of open APIs in social robots, it has become easier to customize general-purpose tools to meet users' needs. However, interpreting high-level user instructions, selecting and configuring appropriate tools, and executing them reliably remain challenging for users without programming experience. To address these challenges, we introduce MistyPilot, an agentic LLM-driven framework for autonomous tool selection, orchestration, and parameter configuration. MistyPilot comprises two core components: a Physically Interactive Agent (PIA) and a Socially Intelligent Agent (SIA). The PIA enables robust sensor-triggered and tool-driven task execution, while the SIA generates socially intelligent and emotionally aligned dialogue. MistyPilot further integrates a fast-slow thinking paradigm to capture user preferences, reduce latency, and improve task efficiency. To comprehensively evaluate MistyPilot, we contribute five benchmark datasets. Extensive experiments demonstrate the effectiveness of our framework in routing correctness, task completeness, fast-slow thinking retrieval efficiency, tool scalability, and emotion alignment. All code, datasets, and experimental videos will be made publicly available on the project webpage.

Paper Structure (18 sections, 4 equations, 2 figures, 11 tables)

Figures (2)

  • Figure 1: Overview of the MistyPilot workflow for interpreting high-level human instructions. Instead of requiring professionals to hand-code robot behaviors and deploy features on the Misty robot, MistyPilot parses natural-language instructions, analyzes the task, selects and parameterizes tools from its library, and executes them on the Misty robot.
  • Figure 2: Overview of the MistyPilot framework. MistyPilot interprets high-level user instructions through a Task Router that dispatches each task to either a Physically Interactive Agent (PIA) or a Socially Intelligent Agent (SIA) for automated tool utilization and parameter adaptation. The PIA oversees a Sensor & Tool Manager to dynamically orchestrate sensor-related and tool-dependent tasks, while the SIA maintains a Task Status Manager to track the current task state. Fast Thinking accelerates inference by reusing stored knowledge and preferences when a request hits memory, while Slow Thinking employs a Script Writer to interpret the task, align responses with fine-grained emotional expression, and deliver them via speaking and movement modules.
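
The control flow described in the Figure 2 caption can be illustrated with a small sketch. This is not the authors' implementation: the class name `FastSlowRouter`, the keyword list, and the `slow_planner` callable are all hypothetical, and a plain keyword heuristic stands in for the LLM-based Task Router. It only shows the general idea that a request is first routed to the PIA or the SIA, and that a memory hit (fast thinking) skips the expensive planning step (slow thinking).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class FastSlowRouter:
    """Toy illustration of routing plus fast/slow thinking (assumed design)."""

    # Hypothetical stand-in for the LLM-driven slow path (e.g. a script writer).
    slow_planner: Callable[[str], List[str]]
    # Memory of previously planned instructions: normalized text -> plan steps.
    memory: Dict[str, List[str]] = field(default_factory=dict)
    # Assumed keyword heuristic; the paper's router is LLM-based, not keyword-based.
    physical_keywords: Tuple[str, ...] = ("move", "turn", "when", "sensor", "led")

    def route(self, instruction: str) -> str:
        """Return 'PIA' for sensor/tool-style tasks, otherwise 'SIA' for dialogue."""
        words = instruction.lower().split()
        return "PIA" if any(k in words for k in self.physical_keywords) else "SIA"

    def plan(self, instruction: str) -> Tuple[str, List[str]]:
        """Return ('fast', plan) on a memory hit, else ('slow', plan) and cache it."""
        key = " ".join(instruction.lower().split())  # normalize case and spacing
        if key in self.memory:
            return "fast", self.memory[key]
        steps = self.slow_planner(instruction)
        self.memory[key] = steps
        return "slow", steps
```

In this sketch, calling `plan` twice with the same instruction invokes `slow_planner` only once; the second call is answered from `memory`. An exact-match dictionary is the simplest possible memory, and a real system would more plausibly need similarity-based retrieval to match paraphrased instructions.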