Enabling Novel Mission Operations and Interactions with ROSA: The Robot Operating System Agent

Rob Royce, Marcel Kaufmann, Jonathan Becktor, Sangwoo Moon, Kalind Carpenter, Kai Pak, Amanda Towler, Rohan Thakker, Shehryar Khattak

TL;DR

ROSA addresses the challenge of making complex robotic platforms accessible to non-experts by connecting natural language interfaces to ROS through a ReAct-based agent. It combines an extensible tool-based action space, lightweight memory, and LangChain-driven logic to translate user intents into ROS actions while providing safety and transparency. The approach is demonstrated across multiple robots and environments, highlighting improved usability, multi-modal interactions, and multilingual support, with a strong emphasis on ethical governance and safety. The work offers a practical, open-source framework for democratizing robotics and guiding responsible AI deployment in mission operations.

Abstract

The advancement of robotic systems has revolutionized numerous industries, yet their operation often demands specialized technical knowledge, limiting accessibility for non-expert users. This paper introduces ROSA (Robot Operating System Agent), an AI-powered agent that bridges the gap between the Robot Operating System (ROS) and natural language interfaces. By leveraging state-of-the-art language models and integrating open-source frameworks, ROSA enables operators to interact with robots using natural language, translating commands into actions and interfacing with ROS through well-defined tools. ROSA's design is modular and extensible, offering seamless integration with both ROS1 and ROS2, along with safety mechanisms like parameter validation and constraint enforcement to ensure secure, reliable operations. While ROSA was originally designed for ROS, it can be extended to work with other robotics middleware to maximize compatibility across missions. ROSA enhances human-robot interaction by democratizing access to complex robotic systems, empowering users of all expertise levels with multi-modal capabilities such as speech integration and visual perception. Ethical considerations are thoroughly addressed, guided by foundational principles like Asimov's Three Laws of Robotics, ensuring that AI integration promotes safety, transparency, privacy, and accountability. By making robotic technology more user-friendly and accessible, ROSA not only improves operational efficiency but also sets a new standard for responsible AI use in robotics and potentially future mission operations. This paper introduces ROSA's architecture and showcases initial mock-up operations in JPL's Mars Yard, a laboratory, and a simulation using three different robots. The core ROSA library is available as open source.
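
To make the idea of "well-defined tools" with parameter validation and constraint enforcement concrete, here is a minimal sketch in plain Python. It is not ROSA's actual API: the `register` decorator, the `set_speed` tool, the `call_tool` dispatcher, and the `MAX_LINEAR_SPEED` limit are all hypothetical names and values chosen for illustration, and the tool only returns a string where a real one would talk to ROS.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical limit for illustration; not a value from the paper.
MAX_LINEAR_SPEED = 0.5  # m/s


@dataclass
class Tool:
    name: str
    description: str
    func: Callable[..., str]


# The agent's action space: every tool the model is allowed to call.
TOOLS: Dict[str, Tool] = {}


def register(name: str, description: str):
    """Decorator that adds a function to the action space."""
    def wrap(func):
        TOOLS[name] = Tool(name, description, func)
        return func
    return wrap


@register("set_speed", "Set the robot's forward speed in m/s.")
def set_speed(speed: float) -> str:
    # Parameter validation: reject values of the wrong type.
    if isinstance(speed, bool) or not isinstance(speed, (int, float)):
        return "error: speed must be a number"
    # Constraint enforcement: refuse commands outside the allowed envelope.
    if not 0.0 <= speed <= MAX_LINEAR_SPEED:
        return f"error: speed must be between 0.0 and {MAX_LINEAR_SPEED} m/s"
    # A real tool would publish to a ROS topic here; this stub only reports.
    return f"ok: speed set to {speed:.2f} m/s"


def call_tool(name: str, **kwargs) -> str:
    """Dispatch a tool call requested by the language model."""
    tool = TOOLS.get(name)
    if tool is None:
        return f"error: unknown tool '{name}'"
    return tool.func(**kwargs)
```

Returning errors as strings, rather than raising, lets the model read the failure as an observation and correct its next call; whether ROSA handles errors this way is an assumption of this sketch.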

Paper Structure (45 sections, 13 figures)

Figures (13)

  • Figure 1: ROSA has been deployed on various robots at NASA Jet Propulsion Laboratory, including NeBula-Spot in the Mars Yard (a), Exobiology Extant Life Surveyor (EELS) in a laboratory environment using synthetic terrain and obstacles (b), and Nova Carter in NVIDIA IsaacSim using a simulated Martian environment (c). Each of these integrations and a demonstration of their capabilities when equipped with ROSA are detailed in the experiments section.
  • Figure 2: A basic ROS system with three nodes (Operator UI, Camera, and Object Detection), one topic (Image), one service client, and one service provider (Start Recording). Image adapted from ROS Tutorials.
  • Figure 3: An example reasoning trace that demonstrates the reasoning-action-observation loop characteristic of ReAct agents. This trace demonstrates a typical response from ROSA when given the query "Provide me with a list of ROS nodes."
  • Figure 4: The ROSA architecture includes memory (chat history and scratchpad), logic, tools (action space), and a tool-calling Large Language Model. Arrows in the diagram represent the general connectivity between components and the external robotics environment.
  • Figure 5: Sequence diagram demonstrating sequential, parallel, single-, and multi-tool calling. Initially, a single tool is called to retrieve a status overview. The model infers that certain subsystems need to be inspected further, so it retrieves detailed status information in parallel for each subsystem before returning a final result.
  • ...and 8 more figures
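
The reasoning-action-observation loop described for Figure 3 and the single-then-parallel tool calling described for Figure 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: `scripted_model` is a hard-coded stand-in for a tool-calling language model, and `status_overview`, `subsystem_status`, and the status strings they return are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical tools; a real deployment would query ROS instead.
def status_overview() -> str:
    return "camera: ok, arm: degraded, base: degraded"


def subsystem_status(name: str) -> str:
    details = {"arm": "joint 2 over temperature", "base": "left wheel slipping"}
    return f"{name}: {details.get(name, 'no data')}"


TOOLS = {"status_overview": status_overview, "subsystem_status": subsystem_status}


def scripted_model(scratchpad):
    """Stand-in for a tool-calling LLM: returns tool calls or a final answer."""
    if not scratchpad:
        # Step 1: a single tool call for the overview.
        return {"calls": [("status_overview", {})]}
    if len(scratchpad) == 1:
        # Step 2: one call per degraded subsystem, issued together.
        overview = scratchpad[0][1]
        degraded = [part.split(":")[0].strip()
                    for part in overview.split(",") if "degraded" in part]
        return {"calls": [("subsystem_status", {"name": n}) for n in degraded]}
    # Step 3: summarize the detailed observations.
    return {"final": "; ".join(obs for _, obs in scratchpad[1:])}


def run_agent(model, max_steps=5):
    scratchpad = []  # (call, observation) pairs from earlier steps
    for _ in range(max_steps):
        decision = model(scratchpad)
        if "final" in decision:
            return decision["final"]
        # Calls issued in the same step run in parallel; map preserves order.
        with ThreadPoolExecutor() as pool:
            observations = list(pool.map(
                lambda call: TOOLS[call[0]](**call[1]), decision["calls"]))
        scratchpad.extend(zip(decision["calls"], observations))
    return "stopped: step limit reached"
```

Calling `run_agent(scripted_model)` walks through one overview call, two parallel detail calls, and a final summary. The `max_steps` bound is a design choice of this sketch to keep the loop from running indefinitely.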