Object-Oriented Semantic Mapping for Reliable UAVs Navigation
Thanh Nguyen Canh, Armagan Elibol, Nak Young Chong, Xiem HoangVan
TL;DR
This work tackles safe UAV navigation in cluttered indoor environments by augmenting metric maps with semantic object information. It fuses RGB-D-based object detection (YOLOv8) and tracking (BoT-SORT) with 2D SLAM (CartoGrapher) to build a probabilistic semantic map that encodes object class, position, and height, including a robust projection of hollow-bottom objects via RANSAC. A probabilistic semantic layer is integrated into the 2D costmap, enabling semantics-aware obstacle avoidance and planning. The approach runs in real time on embedded hardware (Jetson Xavier AGX) and demonstrates high object detection accuracy (~98% AP at IoU = 0.5) and safer navigation around hollow-bottom objects, validated on an RGB-D-equipped UAV platform.
Abstract
To navigate autonomously in real-world environments, especially in search and rescue operations, Unmanned Aerial Vehicles (UAVs) require comprehensive maps to ensure safety. However, the prevalent metric map often lacks the semantic information crucial for holistic scene comprehension. In this paper, we propose a system that constructs a probabilistic metric map enriched with object information extracted from RGB-D images of the environment. Our approach combines a state-of-the-art YOLOv8-based object detection framework at the front end with a 2D SLAM method, CartoGrapher, at the back end. To track and position the semantic object classes extracted by the front end, we employ the BoT-SORT method. A novel association method is introduced to extract the position of objects and then project it onto the metric map. Unlike previous research, our approach accounts for reliable navigation in environments containing various hollow-bottom objects. The output of our system is a probabilistic map that significantly enhances the map's representation by incorporating object-specific attributes: class, accurate position, and height. We conducted a number of experiments to evaluate the proposed approach. The results show that the robot can effectively produce augmented semantic maps containing several objects (notably chairs and desks). Furthermore, we evaluate the system on an embedded computer, a Jetson Xavier AGX unit, to demonstrate its use in real-world applications.
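
To make the idea of a probabilistic semantic layer more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a log-odds occupancy update driven by detection confidence, and it replaces the RANSAC-based projection described in the paper with a much simpler axis-aligned bounding box of the object's 3D points. The names `SemanticCell`, `SemanticLayer`, and `project_object`, the 0.1 m resolution, and the sample desk coordinates are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SemanticCell:
    log_odds: float = 0.0   # 0.0 corresponds to p = 0.5 (unknown)
    label: str = ""         # object class last written to this cell
    height: float = 0.0     # tallest object point projected here (m)


@dataclass
class SemanticLayer:
    resolution: float = 0.1                     # metres per cell (assumed)
    cells: dict = field(default_factory=dict)   # (i, j) -> SemanticCell

    def _index(self, x, y):
        # Convert map coordinates (m) to integer grid indices.
        return (math.floor(x / self.resolution),
                math.floor(y / self.resolution))

    def project_object(self, label, confidence, points):
        """Fill the whole footprint of an object, not only its legs.

        `points` are (x, y, z) map-frame points belonging to one detection.
        The footprint is their axis-aligned bounding box (a simplification).
        """
        p = min(max(confidence, 0.01), 0.99)    # avoid infinite log-odds
        update = math.log(p / (1.0 - p))
        xs = [pt[0] for pt in points]
        ys = [pt[1] for pt in points]
        top = max(pt[2] for pt in points)
        i0, j0 = self._index(min(xs), min(ys))
        i1, j1 = self._index(max(xs), max(ys))
        for i in range(i0, i1 + 1):
            for j in range(j0, j1 + 1):
                cell = self.cells.setdefault((i, j), SemanticCell())
                cell.log_odds += update         # fuse repeated detections
                cell.label = label
                cell.height = max(cell.height, top)

    def probability(self, x, y):
        # Cells never touched by a detection stay at 0.5 (unknown).
        cell = self.cells.get(self._index(x, y))
        if cell is None:
            return 0.5
        return 1.0 / (1.0 + math.exp(-cell.log_odds))


layer = SemanticLayer(resolution=0.1)
# Hypothetical desk-top corners seen by the RGB-D camera (metres).
desk_points = [(1.0, 1.0, 0.75), (1.55, 1.0, 0.75),
               (1.0, 1.35, 0.75), (1.55, 1.35, 0.75)]
layer.project_object("desk", 0.9, desk_points)
layer.project_object("desk", 0.9, desk_points)
print(layer.probability(1.25, 1.15))   # a point under the desk top
print(layer.probability(3.0, 3.0))     # a point far from any object
```

In this sketch, a point under the desk top receives a high occupancy probability even though a 2D laser scan would only register the legs, which is the intuition behind treating hollow-bottom objects specially. A planner could then inflate or forbid such cells based on the stored class and height.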
