PANav: Toward Privacy-Aware Robot Navigation via Vision-Language Models

Bangguo Yu, Hamidreza Kasaei, Ming Cao

TL;DR

PANav is a framework for mobile robot navigation that uses vision-language models to incorporate privacy awareness into adaptive path planning. The authors demonstrate its practical applicability by navigating a robotic platform through real-world office environments.

Abstract

Navigating robots discreetly in human work environments while considering the possible privacy implications of robotic tasks presents significant challenges. Such scenarios are increasingly common, for instance, when robots transport sensitive objects that demand high levels of privacy in spaces crowded with human activities. While extensive research has been conducted on robotic path planning and social awareness, current robotic systems still lack the functionality of privacy-aware navigation in public environments. To address this, we propose a new framework for mobile robot navigation that leverages vision-language models to incorporate privacy awareness into adaptive path planning. Specifically, all potential paths from the starting point to the destination are generated using the A* algorithm. Concurrently, the vision-language model is used to infer the optimal path for privacy-awareness, given the environmental layout and the navigational instruction. This approach aims to minimize the robot's exposure to human activities and preserve the privacy of the robot and its surroundings. Experimental results on the S3DIS dataset demonstrate that our framework significantly enhances mobile robots' privacy awareness of navigation in human-shared public environments. Furthermore, we demonstrate the practical applicability of our framework by successfully navigating a robotic platform through real-world office environments. The supplementary video and code can be accessed via the following link: https://sites.google.com/view/privacy-aware-nav.
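
The two-stage idea described in the abstract (generate candidate paths with A*, then select the most private one) can be illustrated with a minimal sketch. This is not the authors' implementation: the character grid, the intermediate waypoints used to obtain distinct candidates, and the `exposure` count that stands in for the vision-language model's judgment are all hypothetical simplifications chosen for illustration.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of strings; '#' is an obstacle."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):  # Manhattan distance, admissible on a unit-cost grid
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    g_cost = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > g_cost[cell]:
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < g_cost.get((nr, nc), float("inf")):
                g_cost[(nr, nc)] = ng
                came_from[(nr, nc)] = cell
                heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None


def exposure(grid, path):
    """Hypothetical privacy score: number of cells on the path marked 'H'."""
    return sum(1 for r, c in path if grid[r][c] == "H")


def choose_private_path(grid, start, goal, waypoints):
    """Build one A* candidate per waypoint and keep the least exposed one.

    The exposure count is only a stand-in for the VLM-based selection.
    """
    candidates = []
    for wp in waypoints:
        first, second = astar(grid, start, wp), astar(grid, wp, goal)
        if first is not None and second is not None:
            candidates.append(first + second[1:])
    return min(candidates, key=lambda p: (exposure(grid, p), len(p)))


# '.' free cell, '#' wall, 'H' free cell with human activity (assumed labels)
GRID = [
    ".....",
    ".###.",
    ".HH..",
]
best = choose_private_path(GRID, (1, 0), (1, 4), [(0, 2), (2, 2)])
print(best, exposure(GRID, best))
```

In this toy map both candidates have the same length, so the selection is driven entirely by the exposure score; in the framework described above, that decision is instead inferred by the vision-language model from the environmental layout and the navigational instruction.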

Paper Structure (20 sections, 5 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Privacy-aware navigation example. The robot uses vision language models to find a more private path (shown in green) based on the environment map and the navigational instructions.
  • Figure 2: The architecture of the privacy-aware navigation framework. The framework takes a point cloud map as input to generate top-view and topological maps, and an optimal path is selected from all potential paths based on the privacy-aware inference of the vision-language models. Once the optimal path is selected, the robot can follow the path to complete the navigation task.
  • Figure 3: A case of the top-5 possible paths for transporting a classified file from an office to a conference room in Area_5a. The red lines represent the paths extracted from the topological map.
  • Figure 4: Gaussian-modulated distance field based on the traversability map.
  • Figure 5: The 3D scan point cloud of the real office scene includes three office rooms, a conference room, and several corridors.
  • ...and 1 more figure