Garbage Segmentation and Attribute Analysis by Robotic Dogs
Nuo Xu, Jianfeng Liao, Qiwei Meng, Wei Song
TL;DR
This work introduces GSA2Seg, a cloud-based, vision–language–driven framework that performs garbage segmentation and attribute analysis to inform robotic grasping in diverse environments. By fusing visual features with language prompts via bi-directional attention in an encoder–decoder, GSA2Seg predicts precise masks, bounding boxes, and object attributes, enabling state-aware manipulation. The authors contribute the GSA2D dataset, comprising 3119 images across 10 garbage types with attributes such as placement state (standing/lying/deformation) and position (ground/platform), to benchmark open-vocabulary attribute-aware segmentation. Experimental results show GSA2Seg achieving state-of-the-art AP and AP50 while maintaining competitive FPS, with ablations demonstrating the value of language prompts and attribute analysis for robustness and generalization in robotic waste management.
Abstract
Efficient waste management and recycling rely heavily on garbage exploration and identification. In this study, we propose GSA2Seg (Garbage Segmentation and Attribute Analysis), a novel visual approach that uses quadruped robotic dogs as autonomous agents to address waste management and recycling challenges in diverse indoor and outdoor environments. Equipped with an advanced visual perception system, including visual sensors and instance segmentation models, the robotic dogs navigate their surroundings and search for common garbage items. Inspired by open-vocabulary algorithms, we introduce a method for object attribute analysis. By combining garbage segmentation with attribute analysis, the robotic dogs determine the state of each piece of trash, including its position and placement properties. This information improves the robotic arm's grasping capabilities and facilitates successful garbage retrieval. Additionally, we contribute an image dataset, named GSA2D, to support evaluation. Through extensive experiments on GSA2D, this paper provides a comprehensive analysis of GSA2Seg's effectiveness. Dataset available: https://www.kaggle.com/datasets/hellob/gsa2d-2024.
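To make the link between attribute analysis and grasping concrete, the following is a minimal Python sketch of how a single attribute-aware detection might be represented and consumed downstream. It is not the authors' implementation. The placement and position values come from the attribute set described for GSA2D, but the names `GarbageDetection` and `grasp_hint`, the category string, and the mapping from attributes to a grasp approach are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Placement(Enum):
    """Placement states named in the GSA2D attribute description."""
    STANDING = "standing"
    LYING = "lying"
    DEFORMATION = "deformation"


class Position(Enum):
    """Position attributes named in the GSA2D attribute description."""
    GROUND = "ground"
    PLATFORM = "platform"


@dataclass(frozen=True)
class GarbageDetection:
    """Hypothetical container for one detection plus its attributes."""
    category: str                            # illustrative label, not a GSA2D class list
    box: Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max) in pixels
    placement: Placement
    position: Position


def grasp_hint(det: GarbageDetection) -> str:
    """Map attributes to a coarse grasp hint.

    Assumption: the rule below is invented for illustration; the source does
    not specify how attributes are translated into arm motions.
    """
    # Standing objects are approached from the side, everything else from above.
    approach = "side" if det.placement is Placement.STANDING else "top-down"
    # Objects on the ground need a low reach, objects on a platform a raised one.
    reach = "low" if det.position is Position.GROUND else "raised"
    return f"{approach} approach, {reach} reach"


if __name__ == "__main__":
    example = GarbageDetection(
        category="bottle",
        box=(120.0, 80.0, 180.0, 240.0),
        placement=Placement.LYING,
        position=Position.PLATFORM,
    )
    print(grasp_hint(example))
```

The point of the sketch is only that placement and position travel together with the mask or box for each object, so a grasp planner can branch on the object's state rather than on its category alone.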
