Semantic Layering in Room Segmentation via LLMs

Taehyeon Kim; Byung-Cheol Min

TL;DR

Semantic Layering in Room Segmentation via LLMs (SeLRoS) is a semantic room segmentation method that integrates Large Language Models (LLMs) with traditional 2D map-based segmentation to enhance robotic navigation.

Abstract

In this paper, we introduce Semantic Layering in Room Segmentation via LLMs (SeLRoS), an advanced method for semantic room segmentation that integrates Large Language Models (LLMs) with traditional 2D map-based segmentation. Unlike previous approaches that focus solely on the geometric segmentation of indoor environments, our work enriches segmented maps with semantic data, including object identification and spatial relationships, to enhance robotic navigation. By leveraging LLMs, we provide a novel framework that interprets and organizes complex information about each segmented area, thereby improving the accuracy and contextual relevance of room segmentation. Furthermore, SeLRoS overcomes the limitations of existing algorithms by using a semantic evaluation method to distinguish true room divisions from those erroneously generated by furniture and segmentation inaccuracies. The effectiveness of SeLRoS is verified through its application across 30 different 3D environments. Source code and experiment videos for this work are available at: https://sites.google.com/view/selros.

Paper Structure (20 sections, 1 equation, 6 figures, 2 tables, 1 algorithm)

This paper contains 20 sections, 1 equation, 6 figures, 2 tables, 1 algorithm.

Introduction
Related Works
Problem Formulation
Methodology
Geometric Room Segmentation
Object Mapping
Semantic Integration
Room Information Interpretation
Hierarchical Query
Experiments
Experimental Setup
Evaluation Criteria
Results and Analysis
Qualitative Analysis
Environment 1
...and 5 more sections

Figures (6)

Figure 1: Semantic Layering in Room Segmentation via LLMs (SeLRoS) employs a room segmentation algorithm, an object detection algorithm, and Large Language Models (LLMs) to derive a 2D segmentation map from a 3D environment (left side of the figure), as well as to produce semantic information for each segmented room (right side of the figure).
Figure 2: Overview of SeLRoS’s structure: SeLRoS begins with Geometric Room Segmentation, where a 2D map ($M$) of the Original Environment ($E$) is transformed into a Segmentation Map ($S$). Following this, the Object Mapping process extracts Object Information ($O_s$) by applying an Object Detection algorithm to scenes taken from the center coordinates of each segmented space ($s$) in the Original Environment. The Semantic Integration process then harmonizes $s$, $O_s$, and the spatial relation data ($R_s$) through the Room Information Interpreter and generates prompts $P(s, O_s, R_s)$ via Hierarchical Query. The final outputs are the Improved Segmentation Map ($S'$) with Semantic Information ($I$).
Figure 3: Hierarchical Query is composed of a Room-Level Query and an Environment-Level Query. The red box represents the role component, the yellow box represents the instruction, and the blue box represents the set of Semantic Information.
Figure 4: Results for Environment 1 - (a) depicts the original 3D environment, (b) shows the segmentation map created using the Voronoi Random Field (VRF) algorithm, and (c) presents the improved segmentation map, the final result achieved through SeLRoS, with semantic information added for readability.
Figure 5: Results for Environment 2 - (a) depicts the original 3D environment, (b) shows the segmentation map created using the VRF algorithm, and (c) presents the improved segmentation map, the final result achieved through SeLRoS, with semantic information added for readability.
...and 1 more figures
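
The prompt structure described in the captions for Figures 2 and 3 (a role, an instruction, and a set of semantic information, queried first per room and then per environment) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Segment` class, both function names, and all prompt wording are hypothetical and are not taken from the SeLRoS source code, and the spatial relations $R_s$ are simplified here to a list of adjacent segment ids.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One segmented space s (hypothetical container, not from the paper's code)."""
    seg_id: int
    center: tuple[float, float]
    objects: list[str] = field(default_factory=list)    # stands in for O_s
    neighbors: list[int] = field(default_factory=list)  # simplified stand-in for R_s


def room_level_prompt(seg: Segment) -> str:
    """Assemble a prompt in the order role, instruction, semantic information."""
    role = "You are an assistant that labels indoor spaces."          # assumed wording
    instruction = "Name the most likely room type for this segment."  # assumed wording
    info = (
        f"Segment {seg.seg_id} at center {seg.center}; "
        f"objects: {', '.join(seg.objects) or 'none'}; "
        f"adjacent segments: {', '.join(map(str, seg.neighbors)) or 'none'}."
    )
    return "\n".join([role, instruction, info])


def environment_level_prompt(segments: list[Segment], room_answers: dict[int, str]) -> str:
    """Combine room-level answers into a single environment-level query."""
    role = "You are an assistant that reviews a segmented floor plan."            # assumed wording
    instruction = "Decide which adjacent segments belong to the same room."       # assumed wording
    lines = []
    for seg in segments:
        adjacent = ", ".join(map(str, seg.neighbors)) or "none"
        lines.append(f"Segment {seg.seg_id}: {room_answers[seg.seg_id]}; adjacent: {adjacent}")
    return "\n".join([role, instruction, *lines])


if __name__ == "__main__":
    kitchen = Segment(0, (1.0, 2.0), ["oven", "sink"], [1])
    hallway = Segment(1, (4.0, 2.0), [], [0])
    print(room_level_prompt(kitchen))
    print(environment_level_prompt([kitchen, hallway], {0: "kitchen", 1: "hallway"}))
```

In this sketch the room-level answers would come from an LLM call, which is deliberately left out; the dictionary passed to `environment_level_prompt` stands in for those responses.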
