Assured Autonomy with Neuro-Symbolic Perception

R. Spencer Hallyburton, Miroslav Pajic

TL;DR

The paper addresses the lack of assured robustness in perception for cyber-physical systems by highlighting vulnerabilities of pattern-matching DNNs and introducing a neuro-symbolic perception framework, NeuSPaPer, that combines joint object detection with scene graph generation to enable reasoning over semantic relationships. It leverages foundation models for offline knowledge extraction and specialized SGG models for real-time inference, with per-sensor and cross-sensor integrity checks guided by physics-based knowledge. Feasibility studies on nuScenes and CARLA show that scene-graph-based integrity can detect previously stealthy attacks like frustum attacks, improving resilience in multi-sensor fusion. The contributions include a vulnerability analysis, a neuro-symbolic perception and integrity architecture, and an initial feasibility demonstration that motivates future full-stack neuro-symbolic perception research for trusted autonomy in CPS.

Abstract

Many state-of-the-art AI models deployed in cyber-physical systems (CPS), while highly accurate, are simply pattern-matchers. With limited security guarantees, there are concerns for their reliability in safety-critical and contested domains. To advance assured AI, we advocate for a paradigm shift that imbues data-driven perception models with symbolic structure, inspired by a human's ability to reason over low-level features and high-level context. We propose a neuro-symbolic paradigm for perception (NeuSPaPer) and illustrate how joint object detection and scene graph generation (SGG) yields deep scene understanding. Powered by foundation models for offline knowledge extraction and specialized SGG algorithms for real-time deployment, we design a framework leveraging structured relational graphs that ensures the integrity of situational awareness in autonomy. Using physics-based simulators and real-world datasets, we demonstrate how SGG bridges the gap between low-level sensor perception and high-level reasoning, establishing a foundation for resilient, context-aware AI and advancing trusted autonomy in CPS.

Paper Structure

This paper contains 42 sections, 4 equations, 6 figures.

Figures (6)

  • Figure 1: Attacker can alter the semantic understanding of the scene while being stealthy to multi-sensor fusion. Translating existing 3D objects (denoted with white box) backwards or forwards (resulting in the detected 'moved' red boxes) from ego maintains consistency with 2D frustum in image plane. Attacker runs optimization to move object as far back as possible while retaining at least a minimum IoU (overlap) when projected into 2D image.
  • Figure 2: Neuro-symbolic paradigm for perception performs object detection, classification, and scene graph generation jointly, enabling context-based reasoning over, e.g., spatial relationships from multi-modal data. Reasoning over the graphical models is informed by physics-based knowledge bases and happens both for each sensor and between sensors before impacting sensor fusion.
  • Figure 3: Each scene is paired with the images below it. (a) BEV projection of LiDAR point cloud from nuScenes dataset shown with box detections. (b) Geometric rules build scene graphs using 3D boxes. Nodes (blue) connected via edge relations (red).
  • Figure 6: (a) Foundation model jointly detects objects and builds scene graph from image. (b,c) Perception yields rule-based scene graph from LiDAR. (c) Attacker translates van away from the ego; the attacked box, when projected to camera, is still consistent with 2D detections, so camera detections alone cannot detect the attack. (d) Graph-building lifts purely 2D camera data to relational 3D space by inferring positional relationships with context. Inconsistencies identified between camera and LiDAR graphs allow detection of attacks previously thought to be stealthy.
  • Figure 7: Case study of using scene graph generation to secure multi-sensor fusion from attacks on sensing. Analysis procedure follows that of the first CARLA case study (fig:carla-case-1).
  • ...and 1 more figure
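
The idea in the Figure 3 and Figure 6 captions (geometric rules turn boxes into a scene graph, and disagreement between per-sensor graphs exposes an attack) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the `Box` type, the two relations `ahead_of` and `left_of`, the `margin` threshold, and the object positions are all hypothetical. In the paper the camera-side graph is produced by a learned model from images; this sketch stands in for it with made-up positions so that the comparison step can run on its own.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Box:
    """Object centre in the ego frame (hypothetical minimal representation)."""
    name: str
    x: float  # metres ahead of ego
    y: float  # metres to the left of ego


def build_scene_graph(boxes, margin=1.0):
    """Apply two illustrative geometric rules to every ordered pair of boxes.

    Returns a set of (subject, relation, object) triples.
    """
    edges = set()
    for a, b in permutations(boxes, 2):
        if a.x - b.x > margin:
            edges.add((a.name, "ahead_of", b.name))
        if a.y - b.y > margin:
            edges.add((a.name, "left_of", b.name))
    return edges


def cross_sensor_conflicts(graph_a, graph_b):
    """Relations asserted by exactly one of the two graphs."""
    return graph_a ^ graph_b


# Stand-in for the camera-side graph (positions are made up for illustration).
camera_graph = build_scene_graph([Box("van", 10.0, 2.0), Box("car", 20.0, -2.0)])

# Attacked LiDAR detections: the van is pushed 3x farther along its ray from
# ego, so its bearing (and hence its image-plane frustum) is unchanged.
lidar_graph = build_scene_graph([Box("van", 30.0, 6.0), Box("car", 20.0, -2.0)])

conflicts = cross_sensor_conflicts(camera_graph, lidar_graph)
print(sorted(conflicts))
```

In this toy scene the lateral relation (`van left_of car`) is identical in both graphs, which mirrors why a purely 2D consistency check would miss the translation, while the depth ordering between the van and the car flips and shows up as a conflict.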