NEUSIS: A Compositional Neuro-Symbolic Framework for Autonomous Perception, Reasoning, and Planning in Complex UAV Search Missions

Zhixi Cai, Cristian Rojas Cardenas, Kevin Leo, Chenyuan Zhang, Kal Backman, Hanbing Li, Boying Li, Mahsa Ghorbanali, Stavya Datta, Lizhen Qu, Julian Gutierrez Santiago, Alexey Ignatiev, Yuan-Fang Li, Mor Vered, Peter J Stuckey, Maria Garcia de la Banda, Hamid Rezatofighi

TL;DR

NEUSIS addresses autonomous UAV search missions, in which a UAV must locate Entities of Interest (EOIs) under time constraints in hazard-prone environments. It introduces a compositional neuro-symbolic pipeline (GRiD for 3D perception, grounding, and reasoning; a Probabilistic World Model; and SNaC for hierarchical planning) that maintains a persistent belief about the environment. Experiments in HAMERITT AirSim simulations show NEUSIS outperforms a state-of-the-art vision-language baseline and a state-of-the-art planning baseline in success rate, navigation efficiency, and 3D localization. The work demonstrates the value of explicit visual reasoning, probabilistic world modeling, and multi-level planning for robust UAV search under keep-out zones (KOZs) and time limits, with practical potential for search-and-rescue and related hazardous-domain applications.

Abstract

This paper addresses the problem of autonomous UAV search missions, where a UAV must locate specific Entities of Interest (EOIs) within a time limit, based on brief descriptions in large, hazard-prone environments with keep-out zones. The UAV must perceive, reason, and make decisions with limited and uncertain information. We propose NEUSIS, a compositional neuro-symbolic system designed for interpretable UAV search and navigation in realistic scenarios. NEUSIS integrates neuro-symbolic visual perception, reasoning, and grounding (GRiD) to process raw sensory inputs, maintains a probabilistic world model for environment representation, and uses a hierarchical planning component (SNaC) for efficient path planning. Experimental results from simulated urban search missions using AirSim and Unreal Engine show that NEUSIS outperforms a state-of-the-art (SOTA) vision-language model and a SOTA search planning model in success rate, search efficiency, and 3D localization. These results demonstrate the effectiveness of our compositional neuro-symbolic approach in handling complex, real-world scenarios, making it a promising solution for autonomous UAV systems in search missions.

Paper Structure (21 sections, 5 figures, 4 tables)

Figures (5)

  • Figure 1: Overview of NEUSIS. Neuro-symbolic Perception, Grounding, Reasoning in 3D (GRiD); Symbolic Probabilistic World Model; and Selection, Navigation and Coverage (SNaC) components autonomously complete UAV search missions by processing sensor inputs to find targets, such as the red sedan required by the mission description.
  • Figure 2: Screenshots from the Neighborhood environment illustrating different real-world challenges for UAVs.
  • Figure 3: An example mission scenario with four EOIs spread across three AOIs. Prior likelihood of EOI presence is shown in the bottom left corner of each AOI.
  • Figure 4: The pipeline of our proposed neuro-symbolic system, NEUSIS. The UAV operates in a simulated environment (AirSim) and is equipped with an RGB camera, a depth camera, and GPS. The Perception, Grounding, Reasoning in 3D (GRiD) component processes sensor data using a reasoner (code generator) and Vision Foundation Models (VFMs) to generate predictions; its modules include neural segmentation, object detection, and property classification, plus a symbolic 2D tracker and 3D projector. Predictions are sent to the world model, which maintains a belief map and generates target reports. The Selection, Navigation and Coverage (SNaC) component generates a hierarchical plan, with the AOI Selection, AOI Navigation, and Area Coverage modules producing the high-level, mid-level, and low-level plans, respectively.
  • Figure 5: Comparison of (a) our proposed system and (b) the baseline method on the scenario depicted in Figure 3. Filled, colored shapes denote EOI reports, and blue curves represent the UAV's flight path.
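
The captions for Figures 3 and 4 mention per-AOI prior likelihoods and a world model that maintains a belief map. As a rough illustration of how such a belief could be revised after an unsuccessful sweep, the sketch below applies a Bayes update over AOIs and then picks the next AOI greedily. It is not the NEUSIS implementation: the function names, the single-target assumption (the belief is treated as a distribution over AOIs for one EOI), and the `detect_prob` parameter are all hypothetical choices made for this example.

```python
from typing import Dict


def update_after_miss(beliefs: Dict[str, float], searched: str,
                      detect_prob: float) -> Dict[str, float]:
    """Bayes update after sweeping `searched` without detecting the target.

    Assumptions (not from the paper): `beliefs` maps AOI name to the
    probability that the single target is in that AOI, and `detect_prob`
    is the chance a sweep detects the target when it is actually present.
    """
    if searched not in beliefs:
        raise KeyError(f"unknown AOI: {searched}")
    if not 0.0 <= detect_prob <= 1.0:
        raise ValueError("detect_prob must lie in [0, 1]")

    # Likelihood of a miss: (1 - detect_prob) in the searched AOI, 1 elsewhere.
    unnormalised = {
        name: p * ((1.0 - detect_prob) if name == searched else 1.0)
        for name, p in beliefs.items()
    }
    total = sum(unnormalised.values())
    if total == 0.0:
        # All probability mass was ruled out; no valid posterior exists.
        raise ValueError("observation is impossible under the current belief")
    return {name: p / total for name, p in unnormalised.items()}


def select_next_aoi(beliefs: Dict[str, float]) -> str:
    """Greedy stand-in for high-level AOI selection: highest belief wins."""
    return max(beliefs, key=beliefs.get)


# Hypothetical priors for three AOIs, in the spirit of Figure 3.
priors = {"AOI-1": 0.5, "AOI-2": 0.3, "AOI-3": 0.2}
posterior = update_after_miss(priors, searched="AOI-1", detect_prob=0.8)
next_aoi = select_next_aoi(posterior)
```

A greedy rule ignores travel time and keep-out zones, which a hierarchical planner of the kind the captions describe would need to weigh; it is used here only to keep the example short.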