xFLIE: Leveraging Actionable Hierarchical Scene Representations for Autonomous Semantic-Aware Inspection Missions

Vignesh Kottayam Viswanathan, Mario A. V. Saucedo, Sumeet Gajanan Satpute, Christoforos Kanellakis, George Nikolakopoulos

TL;DR

This work introduces the 3D Layered Semantic Graph (3DLSG), a real-time, four-layer hierarchical scene graph for autonomous semantic inspection, and the integrated xFLIE framework, which couples 3DLSG construction with the FLIE planner. The approach enables incremental scene understanding, target selection, and hierarchical path planning in unknown environments, and reports path-planning times multiple orders of magnitude lower than those of conventional volumetric-map-based methods while maintaining inspection effectiveness. Validation through simulations and field trials demonstrates scalable planning over large-scale environments and semantic navigation in outdoor and subterranean settings. The results indicate that a persistent, interpretable scene representation improves autonomy, planning efficiency, and semantic reasoning for inspection missions that involve interaction with human operators and complex targets.

Abstract

We present a novel architecture for the incremental construction and exploitation of a hierarchical 3D scene graph representation during semantic-aware inspection missions. Inspection planning, particularly of distributed targets in previously unseen environments, presents an opportunity to exploit the semantic structure of the scene during reasoning, navigation and scene understanding. Motivated by this, we propose the 3D Layered Semantic Graph (3DLSG), a hierarchical inspection scene graph constructed incrementally and organized into abstraction layers that support planning demands in real time. To address the task of semantic-aware inspection, we propose a mission framework, termed Enhanced First-Look Inspect Explore (xFLIE), that tightly couples the 3DLSG with an inspection planner. We assess the performance through simulations and experimental trials, evaluating target-selection, path-planning and semantic navigation tasks over the 3DLSG model. The scenarios presented are diverse, ranging from city-scale distributed targets to solitary infrastructure targets in simulated worlds, followed by deployments in outdoor and subterranean environments onboard a quadrupedal robot. The proposed method successfully demonstrates incremental construction and planning over the 3DLSG representation to meet the objectives of the missions. Furthermore, the framework demonstrates successful semantic navigation tasks over the structured interface at the end of the inspection missions. Finally, we report a reduction of multiple orders of magnitude in path-planning time compared to conventional volumetric-map-based methods across various environment scales, demonstrating the planning efficiency and scalability of the proposed approach.

Paper Structure

This paper contains 22 sections, 17 equations, 18 figures, 6 tables.

Figures (18)

  • Figure 1: An overview of the inspection scene graph constructed during the semantic-aware inspection mission within a simulated city-scale environment. The middle subfigures illustrate the planning avenues explored during the mission. These include decision-making and hierarchical path planning over the incremental scene graph representation during autonomous inspection. The rightmost subfigure highlights semantic navigation tasks based on a query from a human operator using the structured scene graph.
  • Figure 2: Example of a 3D Layered Semantic Graph (3DLSG) constructed by the xFLIE architecture during a building inspection scenario. The system segments and encodes semantic features such as person and closed-window within the volumetric map (Voxblox; Oleynikova et al., 2017), resulting in a unified multi-layer representation of the scene.
  • Figure 3: A schematic representation of the internal composition of the 3DLSG.
  • Figure 4: The proposed xFLIE architecture integrates FLIE, an inspection and exploration planner, with the 3DLSG, an actionable hierarchical scene representation, for semantic-aware inspection in unknown environments. xFLIE uses RGB, depth and LiDAR measurements (shown at top left) to segment and localize the desired semantics during inspection and exploration. The bifurcated FLIE planner populates the corresponding layers within the 3DLSG, (a) the $\textit{Target}$ layer during exploration and (b) the $\textit{Level}$, $\textit{Pose}$ and $\textit{Feature}$ layers during inspection, and outputs (c) an actionable hierarchical scene representation that is used to address planning and semantic queries.
  • Figure 5: An overview of the optimization process of the Target layer graph after the exploration phase.
  • ...and 13 more figures
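
The four layers named in the Figure 4 caption (Target, Level, Pose, Feature) can be pictured as a tree-like graph in which each node hangs off a node in the layer above. The sketch below is only an illustration under stated assumptions: the nesting order Target → Level → Pose → Feature, the class and method names (`Node`, `LayeredGraph`, `add`, `find`, `path_to_root`) and the node identifiers are all hypothetical and are not taken from the paper's implementation. It shows how a semantic query such as "closed-window" (a feature label mentioned in the Figure 2 caption) could be resolved to a chain of nodes without touching a volumetric map.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Layer names come from the Figure 4 caption; the top-to-bottom nesting
# order used here is an assumption made for this sketch.
LAYERS = ("Target", "Level", "Pose", "Feature")


@dataclass
class Node:
    node_id: str
    layer: str
    label: str = ""
    parent: Optional[str] = None
    children: List[str] = field(default_factory=list)


class LayeredGraph:
    """Hypothetical minimal container for a four-layer scene graph."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def add(self, node_id: str, layer: str, label: str = "",
            parent: Optional[str] = None) -> Node:
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        if parent is None:
            # Only top-layer nodes may exist without a parent.
            if layer != LAYERS[0]:
                raise ValueError(f"{layer} node needs a parent")
        else:
            parent_layer = self.nodes[parent].layer
            # A child must sit exactly one layer below its parent.
            if LAYERS.index(layer) != LAYERS.index(parent_layer) + 1:
                raise ValueError(f"{layer} cannot attach to {parent_layer}")
            self.nodes[parent].children.append(node_id)
        node = Node(node_id, layer, label, parent)
        self.nodes[node_id] = node
        return node

    def find(self, label: str) -> List[str]:
        """Return ids of Feature-layer nodes carrying the given label."""
        return [n.node_id for n in self.nodes.values()
                if n.layer == "Feature" and n.label == label]

    def path_to_root(self, node_id: str) -> List[str]:
        """Walk parent links upward and return the chain from the top layer down."""
        chain: List[str] = []
        current: Optional[str] = node_id
        while current is not None:
            chain.append(current)
            current = self.nodes[current].parent
        return chain[::-1]


# Usage: one target, populated layer by layer, then queried by label.
g = LayeredGraph()
g.add("building_0", "Target", "building")
g.add("level_0", "Level", parent="building_0")
g.add("pose_0", "Pose", parent="level_0")
g.add("feat_0", "Feature", "closed-window", parent="pose_0")

hits = g.find("closed-window")
route = g.path_to_root(hits[0])
print(route)
```

The query touches only the graph nodes, so its cost depends on the number of nodes rather than on the size of the mapped volume. This is the general intuition behind planning over a layered abstraction; it makes no claim about how the paper's planner is implemented.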