Ask, Reason, Assist: Robot Collaboration via Natural Language and Temporal Logic

Dan BW Choe, Sundhar Vinodh Sangeetha, Steven Emanuel, Chih-Yuan Chiu, Samuel Coogan, Shreyas Kousik

TL;DR

This work proposes a peer-to-peer coordination protocol that enables robots to request and provide help without a central task allocator, and achieves performance comparable to a centralized "Oracle" baseline but without heavy information demands.

Abstract

Increased robot deployment, such as in warehousing, has revealed a need for collaboration among heterogeneous robot teams to resolve unforeseen conflicts. To this end, we propose a peer-to-peer coordination protocol that enables robots to request and provide help without a central task allocator. The process begins when a robot detects a conflict and uses a Large Language Model (LLM) to decide whether external assistance is required. If so, it crafts and broadcasts a natural language (NL) help request. Potential helper robots reason over the request and respond with offers of assistance, including information about the effect on their ongoing tasks. Helper reasoning is implemented via an LLM grounded in Signal Temporal Logic (STL) using a Backus-Naur Form (BNF) grammar, ensuring syntactically valid NL-to-STL translations, which are then solved as a Mixed Integer Linear Program (MILP). Finally, the requester robot selects a helper by reasoning over the expected increase in system-level total task completion time. We evaluated our framework through experiments comparing different helper-selection strategies and found that considering multiple offers allows the requester to minimize added makespan. Our approach significantly outperforms heuristics such as selecting the nearest available candidate helper robot, and achieves performance comparable to a centralized "Oracle" baseline but without heavy information demands.
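
The final selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Offer` class, its field names, and the numbers are hypothetical, and the cost of an offer is taken to be the sum of the time to reach the help site and the extension of the helper's own plan, following the Figure 2 caption. In the paper these quantities come from each helper's MILP solution; here they are simply hard-coded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One helper's reply to a help request (all field names are hypothetical)."""
    helper_id: str
    tau_help: int    # time steps for the helper to reach the help site
    tau_new: int     # time steps by which the helper's own plan is extended
    distance: float  # distance to the help site, used only by the heuristic

    @property
    def added_time(self) -> int:
        # Assumed cost of accepting this offer: tau_h + tau_new.
        return self.tau_help + self.tau_new


def select_min_added_time(offers):
    """Pick the offer that adds the fewest time steps to the system."""
    return min(offers, key=lambda o: o.added_time)


def select_nearest(offers):
    """Baseline heuristic: pick the helper closest to the help site."""
    return min(offers, key=lambda o: o.distance)


# Made-up offers; the first one mirrors the Figure 2 example (2 + 2 = 4).
offers = [
    Offer("forklift_1", tau_help=2, tau_new=2, distance=5.0),
    Offer("forklift_2", tau_help=1, tau_new=6, distance=2.0),
    Offer("forklift_3", tau_help=4, tau_new=3, distance=9.0),
]

best = select_min_added_time(offers)
nearest = select_nearest(offers)
print(best.helper_id, best.added_time)
print(nearest.helper_id, nearest.added_time)
```

In this made-up instance the nearest helper is not the cheapest one, because helping would disrupt its own plan more. That is the situation in which comparing several offers is expected to pay off over a distance-only rule.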

Paper Structure

This paper contains 24 sections, 16 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Overview of the Proposed Framework: A requester broadcasts a natural language help request, which helpers translate into syntactically valid temporal logic (TL) via a BNF grammar. Each helper then independently solves a MILP for an updated optimal path to assess the cost of the help task, and proposes help. Finally, the requester selects and confirms the offer that minimizes the overall impact on the system. Our framework demonstrates how natural language (NL) can serve as a flexible medium for a heterogeneous multi-robot help-request forum. Using constrained generation with a BNF grammar guarantees valid TL translations, while solving the decentralized MILP optimization problem achieves performance close to a centralized "Oracle" baseline.
  • Figure 2: An example of a reconfigured path. The help site is reached in 2 time-steps, extending the original path by 2 time-steps for a total cost $\tau^{\mathrm{h}}_j + \tau^{\mathrm{new}}_j = 4$ time-steps.
  • Figure 3: Box plot comparison of total time-steps added to the system under the different methods tested in Experiment 2. Our method tracks the Oracle solution to within 22% (mean) while significantly outperforming the distance-based heuristics (B2) and the hybrid approach (B3).
  • Figure 4: Unity-based demonstrations of our NL-to-TL decentralized framework. (Left) Pallet Cleanup: a forklift responds to an NL request to clear a blocked aisle, updating its MILP plan to integrate the help task. (Center) Warehouse Kitting: unordered conjunctive goals ("pick A, B, C") are translated into a concise TL formula allowing flexible task sequencing. (Right) Sequential Tool Retrieval: strict temporal ordering ("first-then-finally") is captured by the nested TL specification, enabling execution of complex multi-step tasks. These demonstrations showcase our end-to-end framework, in which each helper robot can translate natural language into valid temporal logic and update its MILP plan, running in a realistic Unity warehouse simulator.
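
To make the idea of grammar-valid temporal logic in the captions above more concrete, the following Python sketch checks strings against a toy BNF grammar. Everything in it is an assumption made for illustration: the grammar, the operator spellings (`F[a,b]` for "eventually", `G[a,b]` for "always", `&` for conjunction), and the atom names are hypothetical and are not the grammar used in the paper. Note also that the paper constrains the LLM during generation so that invalid output cannot be produced, whereas this sketch only validates a finished string.

```python
import re

# Toy grammar (hypothetical, for illustration only):
#   <formula> ::= <term> | <term> "&" <formula>
#   <term>    ::= ("F" | "G") "[" <int> "," <int> "]" "(" <formula> ")" | <atom>
#   <atom>    ::= identifier other than "F" or "G"

IDENT = re.compile(r"[A-Za-z_]\w*")


def tokenize(text):
    # Integers, identifiers, or any single non-space character.
    return re.findall(r"\d+|[A-Za-z_]\w*|\S", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        self.pos += 1

    def integer(self):
        tok = self.peek()
        if tok is None or not tok.isdecimal():
            raise SyntaxError(f"expected integer, got {tok!r}")
        self.pos += 1

    def formula(self):
        self.term()
        while self.peek() == "&":
            self.pos += 1
            self.term()

    def term(self):
        tok = self.peek()
        if tok in ("F", "G"):  # bounded temporal operator
            self.pos += 1
            self.expect("[")
            self.integer()
            self.expect(",")
            self.integer()
            self.expect("]")
            self.expect("(")
            self.formula()
            self.expect(")")
        elif tok is not None and IDENT.fullmatch(tok):  # atomic proposition
            self.pos += 1
        else:
            raise SyntaxError(f"unexpected token {tok!r}")


def is_valid(text):
    """Return True if the whole string is derivable from the toy grammar."""
    parser = Parser(tokenize(text))
    try:
        parser.formula()
    except SyntaxError:
        return False
    return parser.peek() is None


# Unordered goals, in the spirit of "pick A, B, C".
kitting = "F[0,20](at_A) & F[0,20](at_B) & F[0,20](at_C)"
# Nested operators, in the spirit of "first-then-finally".
sequential = "F[0,30](at_tool1 & F[0,30](at_tool2 & F[0,30](at_bench)))"

print(is_valid(kitting), is_valid(sequential), is_valid("F[0,20](at_A &"))
```

The two example formulas show one common way to encode the two patterns: a flat conjunction of "eventually" terms leaves the visiting order open, while nesting each "eventually" inside the previous one forces an order. Whether the paper encodes its demonstrations exactly this way is not stated in the captions.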