Autonomous Surface Selection For Manipulator-Based UV Disinfection In Hospitals Using Foundation Models

Xueyan Oh, Jonathan Her, Zhixiang Ong, Brandon Koh, Yun Hann Tan, U-Xuan Tan

TL;DR

This work tackles the challenge of autonomously selecting surfaces for UV disinfection in hospital settings, where traditional lamp-based robots pose safety risks and defining surfaces is labor-intensive. It introduces a pipeline that leverages foundation models to autonomously extract cleaning points, paired with a VLM-assisted segmentation refinement to exclude thin and small non-target objects, achieving more than 92% segmentation success. The solution comprises perception, cleaning-point selection, and planning/execution modules, producing 3D surface points and buffer zones for safe disinfection, and is demonstrated on a UR10e manipulator with simulated UV illumination. By removing the need for fine-tuning and minimizing operator input, the approach offers a practical path toward scalable, safe robotic UV disinfection in clinical environments, while outlining limitations and avenues for future improvement.

Abstract

Ultraviolet (UV) germicidal radiation is an established non-contact method for surface disinfection in medical environments. Traditional approaches require substantial human intervention to define disinfection areas, complicating automation, while deep learning-based methods often need extensive fine-tuning and large datasets, which can be impractical for large-scale deployment. Additionally, these methods often do not address scene understanding for partial surface disinfection, which is crucial for avoiding unintended UV exposure. We propose a solution that leverages foundation models to simplify surface selection for manipulator-based UV disinfection, reducing human involvement and removing the need for model training. We also propose a VLM-assisted segmentation refinement to detect and exclude thin and small non-target objects, and show that this reduces mis-segmentation errors. Our approach achieves a success rate of over 92% in correctly segmenting target and non-target surfaces, and experiments with a physical manipulator and simulated UV light demonstrate its potential for real-world applications.

Paper Structure

This paper contains 13 sections, 7 figures, 2 tables, 1 algorithm.

Figures (7)

  • Figure 1: Summary of the limitations in existing approaches and our proposed approach for autonomous surface selection and target point generation for manipulator-based UV disinfection in hospitals.
  • Figure 2: Overview of our proposed pipeline that leverages foundation models for autonomous perception and selection of cleaning points for manipulator-based UV disinfection of high-touch surfaces in hospitals. The pipeline consists of (1) a Perception module that passes an RGB image twice into Grounded SAM: first with a user-defined prompt to segment target surfaces, and second with a fixed prompt to detect thin and small objects for our proposed VLM-assisted refinement of the non-target mask, reducing mis-segmentation errors in the target mask. The resulting masks are aligned with the depth map to extract point clouds; (2) a Cleaning Points Selection module that processes these point clouds using a series of filters to select the final surface points for UV disinfection; and (3) a Planning and Execution module that can use these selected points to generate waypoints and plan the path and motion of a manipulator. Modules 1 and 2 in blue represent our main contributions, while module 3 is included for completeness.
  • Figure 3: Applying erosion using a 10x10 kernel of ones to the initial target mask (b) effectively reduces noise, but mis-segmentation errors are still present (c). A 20x20 kernel is required to sufficiently filter noise from the inverted target mask (d), which represents non-target surfaces, but this often leads to the loss of fine features (e) such as an oxygen tube. We address this by segmenting thin and small objects using a separate prompt designed to detect them (f). The output masks are then combined to obtain a fine-feature mask (g), which is then merged with the filtered inverted target mask to obtain the final non-target mask (h).
  • Figure 4: Experimental setup featuring a UR10e manipulator with an RGB-D camera and a custom UV light-emitting end effector. In each experiment, the manipulator is positioned 40–70 cm from the nearest point of the target object, with the end effector initially oriented toward the target.
  • Figure 5: Sample images, segmentation results in green (target mask) and red (non-target mask), and scores given for successful segmentation of target surfaces (T) and exclusion of non-target objects (NT, where 1/3 refers to 1 out of 3 objects successfully excluded). Combining our proposed non-target mask with the target mask effectively reduces mis-segmentation errors in the target mask (red arrows). Our pipeline is generally robust to various combinations of target surfaces and randomly placed non-target objects, even under high colour similarity. Possible failure modes include segmenting the geriatric chair's armrest as a target surface (geriatric chair, right) and failing to exclude the base of the disinfectant holder (bed railing, left).
  • ...and 2 more figures
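
The mask refinement described in the Figure 3 caption can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names `erode` and `refine_masks` are hypothetical, the border handling is an assumption, the inverted mask is assumed to be taken from the initial (un-eroded) target mask, and the final step that subtracts the non-target mask from the target mask is inferred from the Figure 5 caption rather than stated explicitly. The fine-feature mask is taken as given here; in the paper it comes from a second Grounded SAM pass with a fixed prompt.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def erode(mask: np.ndarray, k: int) -> np.ndarray:
    """Binary erosion with a k x k kernel of ones.

    Assumption: pixels outside the image are treated as foreground, so the
    image border alone never erodes a region.
    """
    before = k // 2          # padding placed before each axis
    after = k - 1 - before   # padding placed after each axis
    padded = np.pad(mask.astype(bool), ((before, after), (before, after)),
                    constant_values=True)
    windows = sliding_window_view(padded, (k, k))  # shape (H, W, k, k)
    # A pixel survives only if every pixel in its k x k window is foreground.
    return windows.all(axis=(-2, -1))


def refine_masks(initial_target, fine_feature, k_target=10, k_non_target=20):
    """Return (target, non_target) boolean masks.

    initial_target: mask from the user-prompted segmentation.
    fine_feature:   combined mask of thin and small objects.
    Kernel sizes follow the 10x10 and 20x20 values in the Figure 3 caption.
    """
    initial_target = initial_target.astype(bool)
    # Filter noise from the target mask with the smaller kernel.
    target = erode(initial_target, k_target)
    # Invert to get non-target surfaces, then filter with the larger kernel.
    non_target = erode(~initial_target, k_non_target)
    # The larger kernel wipes out thin objects, so merge them back in.
    non_target = non_target | fine_feature.astype(bool)
    # Assumed final step: drop target pixels that fall on non-target objects.
    target = target & ~non_target
    return target, non_target
```

Calling `refine_masks(initial_target, fine_feature)` on two boolean arrays of the same shape returns disjoint target and non-target masks. A thin object such as a tube disappears from `erode(~initial_target, 20)` because it is narrower than the kernel, and reappears in the non-target mask only through `fine_feature`, which is the motivation the caption gives for the second prompt.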