Generative AI Agents in Autonomous Machines: A Safety Perspective

Jason Jabbour, Vijay Janapa Reddi

TL;DR

This work investigates how safety requirements evolve when generative models are integrated as agents into physical autonomous machines, compares them with safety considerations in less critical AI applications, and recommends developing and implementing comprehensive safety scorecards for the use of generative AI technologies in autonomous machines.

Abstract

The integration of Generative Artificial Intelligence (AI) into autonomous machines represents a major paradigm shift in how these systems operate and unlocks new solutions to problems once deemed intractable. Although generative AI agents provide unparalleled capabilities, they also raise unique safety concerns. These challenges require robust safeguards, especially for autonomous machines that operate in high-stakes environments. This work investigates the evolving safety requirements when generative models are integrated as agents into physical autonomous machines, comparing these to safety considerations in less critical AI applications. We explore the challenges and opportunities in ensuring the safe deployment of generative AI-driven autonomous machines. Furthermore, we provide a forward-looking perspective on the future of AI-driven autonomous systems and emphasize the importance of evaluating and communicating safety risks. As an important step towards addressing these concerns, we recommend the development and implementation of comprehensive safety scorecards for the use of generative AI technologies in autonomous machines.

Paper Structure

This paper contains 29 sections, 5 figures.

Figures (5)

  • Figure 1: Publication trend from IROS, ICRA, and RSS proceedings, using the keywords LLMs, generative models, and embodied AI. The data shows a sharp rise in papers on generative methods in robotics, especially since 2018, reflecting growing research interest.
  • Figure 2: Server-grade setup (8 NVIDIA H100 GPUs with Intel(R) Xeon(R) Platinum 8468) versus a common edge setup for robotics (NVIDIA Jetson AGX Orin), taken from the MLPerf Inference benchmarks on the GPT-J 6B model for the LLM summarization task. The figure shows the gap between the two setups in tokens/second and power consumption [mlcommons_benchmark, nvidia_jetson_orin, hyperstack_h100_benchmarks].
  • Figure 3: Maximum runtime of an 8 NVIDIA H100 GPU setup versus an NVIDIA Jetson AGX Orin on the 932.4 Wh battery pack of an ANYmal quadruped [anymal_specs], conservatively assuming that actuators consume only 50% of the power [wu2023review] and that 100% FLOP utilization is achieved. This highlights the infeasibility of achieving server-grade throughput on an edge platform due to battery limitations.
  • Figure 4: Memory required to load LLMs (assuming FP16 parameters) and benchmarked step-by-step reasoning performance [scale_tool_use]. Memory constraints limit the Jetson AGX Orin to small models (e.g. Llama 3.1-8B), which in turn limits reasoning capability.
  • Figure 5: Example of a "safety scorecard" for generative models across four levels of the computing stack in an autonomous system.
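
The arithmetic behind Figures 3 and 4 can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' calculation: the function names are hypothetical, the 50% actuator share is read as leaving the other half of the battery energy for compute, and the power draws in the usage lines are placeholder values rather than figures taken from the paper. The only numbers from the captions are the 932.4 Wh battery, the 50% actuator share, and the FP16 assumption (2 bytes per parameter).

```python
def fp16_weight_memory_gb(num_params: float) -> float:
    """Memory (in GB, 1 GB = 1e9 bytes) needed to hold the weights alone
    when each parameter is stored in FP16, i.e. 2 bytes per parameter.
    Activations and the KV cache are ignored here."""
    bytes_per_param = 2
    return num_params * bytes_per_param / 1e9


def max_runtime_hours(battery_wh: float, compute_power_w: float,
                      actuator_share: float = 0.5) -> float:
    """Hours of operation if compute may use only the share of the battery
    energy that is not reserved for the actuators (assumed interpretation)."""
    compute_budget_wh = battery_wh * (1.0 - actuator_share)
    return compute_budget_wh / compute_power_w


# Usage with placeholder power draws (assumptions, not values from the paper).
EDGE_POWER_W = 60.0      # hypothetical edge-module draw
SERVER_POWER_W = 5600.0  # hypothetical draw for an 8-GPU server

weights_gb = fp16_weight_memory_gb(8e9)               # an 8B-parameter model
edge_hours = max_runtime_hours(932.4, EDGE_POWER_W)
server_hours = max_runtime_hours(932.4, SERVER_POWER_W)
print(f"8B model weights in FP16: {weights_gb:.1f} GB")
print(f"edge runtime: {edge_hours:.2f} h, server runtime: {server_hours * 60:.1f} min")
```

Under these placeholder draws the same battery sustains the edge module for hours but the server-grade setup for only minutes, which is the qualitative point Figure 3 makes; the exact values depend on the measured power numbers used in the paper.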