Querying Perception Streams with Spatial Regular Expressions

Jacob Anderson, Georgios Fainekos, Bardh Hoxha, Hideki Okamoto, Danil Prokhorov

TL;DR

This work introduces SpREs, a novel querying language for pattern matching over perception streams containing spatial and temporal data derived from multi-modal dynamic environments, and presents the STREM tool, an offline and online pattern matching framework for perception data.

Abstract

Perception in fields like robotics, manufacturing, and data analysis generates large volumes of temporal and spatial data to effectively capture the environment. However, sorting through this data for specific scenarios is a meticulous and error-prone process that is often application-dependent and lacks generality and reproducibility. In this work, we introduce Spatial Regular Expressions (SpREs) as a novel querying language for pattern matching over perception streams containing spatial and temporal data derived from multi-modal dynamic environments. To highlight the capabilities of SpREs, we developed the STREM tool as both an offline and online pattern matching framework for perception data. We demonstrate the offline capabilities of STREM through a case study on a publicly available AV dataset (Woven Planet Perception) and its online capabilities through a case study integrating STREM in ROS with the CARLA simulator. We also conduct performance benchmark experiments on various SpRE queries. Using our matching framework, we are able to find over 20,000 matches within 296 ms, making STREM applicable in runtime monitoring applications.
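
To make the idea of pattern matching over a perception stream concrete, the following is a minimal sketch under stated assumptions, not SpRE syntax and not the STREM API. It assumes a hypothetical toy stream in which each frame is just the set of detected object classes, encodes each frame as one symbol (the `encode` function and the symbols `B`, `C`, `O` are illustrative choices), and then runs an ordinary regular expression over the resulting string. Spatial operators over bounding boxes, which SpREs are designed to express, are outside this sketch.

```python
import re

# Hypothetical toy stream: each frame is the set of object classes detected in it.
frames = [
    {"car"},
    {"car", "pedestrian"},
    {"car", "pedestrian"},
    {"car"},
    {"pedestrian"},
    {"car", "pedestrian"},
]


def encode(frame):
    """Map one frame to a single symbol based on a per-frame predicate."""
    if {"car", "pedestrian"} <= frame:
        return "B"  # both a car and a pedestrian are present
    if "car" in frame:
        return "C"  # a car but no pedestrian
    return "O"  # anything else


def find_matches(frames, pattern):
    """Return (start, end) frame index ranges, end exclusive, that match pattern."""
    symbols = "".join(encode(f) for f in frames)
    return [m.span() for m in re.finditer(pattern, symbols)]


# A car-only frame followed by one or more frames with both a car and a pedestrian.
print(find_matches(frames, r"CB+"))  # prints [(0, 3)]
```

Each returned pair is a range of frame indices, so a match can be mapped back to the original frames for inspection. Encoding frames as symbols up front keeps the temporal matching step independent of how the per-frame predicate is evaluated.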

Paper Structure

This paper contains 34 sections, 13 equations, 12 figures, 2 tables, 1 algorithm.

Figures (12)

  • Figure 1: An example of an AV perception stack pipeline. In our running examples and case studies, the perception data may be received from: (1) segmentation data from LiDAR, (2) object annotations from images, (3) geo-spatial data, or (4) localization maps.
  • Figure 2: An example perception stream, sourced from (2) in Figure 1, containing frames 0, 1, and 2 of a camera sensor channel $c$ with its image space. For each object in a given frame, a classification and bounding box are minimally assumed to be attributed. In addition, each frame may be augmented with other sensor data relevant to the system, such as GPS or IMU, to provide further context.
  • Figure 3: An example perception stream, sourced from (4) in Figure 1, containing frames 0, 1, and 2 of a channel $c$ whose space is a subset of a three-dimensional space. For each object in a given frame, a classification, rotation, and bounding box are minimally assumed to be attributed. In addition, the view may provide additional information such as lanes, crosswalks, or bike lanes.
  • Figure 4: SpRE to DFA.
  • Figure 5: The architectural design of STREM.
  • ...and 7 more figures

Theorems & Definitions (14)

  • Definition 1: S4+ Syntax
  • Definition 2: S4m+ Syntax
  • Definition 3: S4u+ Syntax
  • Definition 4: SpRE Syntax
  • Remark 1: Alphabet
  • Example 1
  • Definition 5: S4+ Semantics
  • Definition 6: S4m+ Semantics
  • Definition 7: S4u+ Semantics
  • Definition 8: SpRE Semantics
  • ...and 4 more