Vision Language Model-based Testing of Industrial Autonomous Mobile Robots

Jiahui Wu, Chengjie Lu, Aitor Arrieta, Shaukat Ali, Thomas Peyrucain

TL;DR

This work proposes a Vision Language Model (VLM)-based testing approach (RVSG) for industrial AMRs developed together with PAL Robotics, and evaluates RVSG with several requirements and navigation routes in a simulator using the latest AMR from PAL Robotics.

Abstract

PAL Robotics, in Spain, builds a variety of Autonomous Mobile Robots (AMRs), which are deployed in diverse environments (e.g., warehouses, retail spaces, and offices), where they work alongside humans. Given that human behavior can be unpredictable and that AMRs may not have been trained to handle all possible unknown and uncertain behaviors, it is important to test AMRs under a wide range of human interactions to ensure their safe behavior. Moreover, testing in real environments with actual AMRs and humans is often costly, impractical, and potentially hazardous (e.g., it could result in human injury). To this end, we propose a Vision Language Model (VLM)-based testing approach (RVSG) for industrial AMRs developed together with PAL Robotics. Based on the functional and safety requirements, RVSG uses the VLM to generate diverse human behaviors that violate these requirements. We evaluated RVSG with several requirements and navigation routes in a simulator using the latest AMR from PAL Robotics. Our results show that, compared with the baseline, RVSG can effectively generate requirement-violating scenarios. Moreover, RVSG-generated scenarios increase variability in robot behavior, thereby helping reveal the robots' uncertain behaviors.

Paper Structure

This paper contains 25 sections, 3 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: A representative autonomous mobile robot from PAL Robotics, TIAGo OMNI Base. It is planned to be equipped with the MAPLE-K loop to enable self-adaptation in uncertain and unknown environments. The phases of MAPLE-K highlighted in orange are those supported by RVSG.
  • Figure 2: Overview of RVSG during the iterative testing of autonomous mobile robots. A process decorated with an icon involves a human operator; all other processes are fully automated.
  • Figure 3: Prompt templates, prompt field example, and scenario generation with feedback and memory in RVSG.
  • Figure 4: Warehouse simulation world and navigation routes for AMR testing. Circles labeled "S" and "E" indicate the start and end of each route, respectively.
  • Figure 5: Distributions of each metric for the best and worst scenarios generated by RVSG (each repeated 30 times), for different requirements and navigation routes. The mean represents the central tendency. (RQ3)