Spatialyze: A Geospatial Video Analytics System with Spatial-Aware Optimizations

Chanwut Kittivorawong, Yongming Ge, Yousef Helal, Alvin Cheung

TL;DR

Spatialyze addresses the lack of end-to-end systems for geospatial video analytics by introducing S-Flow, a Python-embedded DSL that expresses geospatial workflows in a build-filter-observe paradigm. The system defers expensive ML inference until the observe step and applies four geospatially aware optimizations (Road Visibility Pruner, Object Type Pruner, Geometry-Based 3D Location Estimator, and Exit Frame Sampler) within a streaming execution pipeline made up of a Data Integrator, Video Processor, Movable Objects Query Engine, and Output Composer. Evaluations on real datasets (nuScenes, VIVA, SkyQuery) show speedups of up to $7.3\times$ in some cases with accuracy of up to $97.1\%$, along with ablation results for the individual optimizations. The work demonstrates that leveraging geospatial metadata and the physical behavior of real-world objects can significantly accelerate end-to-end geospatial video analytics, enabling scalable analysis of large video corpora for journalism, surveillance, and autonomous vehicle (AV) applications.

Abstract

Videos that are shot using commodity hardware such as phones and surveillance cameras record various metadata such as time and location. We encounter such geospatial videos on a daily basis and such videos have been growing in volume significantly. Yet, we do not have data management systems that allow users to interact with such data effectively. In this paper, we describe Spatialyze, a new framework for end-to-end querying of geospatial videos. Spatialyze comes with a domain-specific language where users can construct geospatial video analytic workflows using a 3-step, declarative, build-filter-observe paradigm. Internally, Spatialyze leverages the declarative nature of such workflows, the temporal-spatial metadata stored with videos, and physical behavior of real-world objects to optimize the execution of workflows. Our results using real-world videos and workflows show that Spatialyze can reduce execution time by up to 5.3x, while maintaining up to 97.1% accuracy compared to unoptimized execution.
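
To make the build-filter-observe idea concrete, here is a minimal sketch of a lazily evaluated workflow in plain Python. It is not S-Flow and does not use Spatialyze's API: `Frame`, `Workflow`, `add_video`, `filter`, `observe`, the `road_visible` flag, and the toy `_detect` method are all hypothetical names chosen for illustration. The sketch only shows the general pattern the abstract describes: the build and filter steps record intent, and the expensive model runs at observe time, where metadata can be used to skip frames first.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Frame:
    index: int
    # Hypothetical stand-in for a geospatial metadata lookup
    # (e.g. "is any road inside this camera's view?").
    road_visible: bool


@dataclass
class Workflow:
    frames: List[Frame] = field(default_factory=list)
    predicates: List[Callable[[Dict], bool]] = field(default_factory=list)
    detector_calls: int = 0  # counts how often the "expensive" model runs

    def add_video(self, frames: List[Frame]) -> "Workflow":
        """Build step: register data only, no inference."""
        self.frames.extend(frames)
        return self

    def filter(self, predicate: Callable[[Dict], bool]) -> "Workflow":
        """Filter step: record the predicate only, no inference."""
        self.predicates.append(predicate)
        return self

    def _detect(self, frame: Frame) -> List[Dict]:
        """Toy detector standing in for an expensive ML model."""
        self.detector_calls += 1
        kind = "car" if frame.index % 2 == 0 else "person"
        return [{"frame": frame.index, "type": kind}]

    def observe(self) -> List[Dict]:
        """Observe step: execution happens here, after metadata pruning."""
        results = []
        for frame in self.frames:
            if not frame.road_visible:
                continue  # skipped using metadata alone, detector never runs
            for obj in self._detect(frame):
                if all(p(obj) for p in self.predicates):
                    results.append(obj)
        return results


frames = [Frame(i, road_visible=(i < 4)) for i in range(6)]
wf = Workflow().add_video(frames).filter(lambda o: o["type"] == "car")
calls_before_observe = wf.detector_calls  # still 0: nothing has executed yet
cars = wf.observe()
```

In this toy run the detector is invoked only for the four frames whose metadata marks a road as visible, and only when `observe()` is called. The real system's pruning and execution logic is considerably more involved than this flag check.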

Paper Structure

This paper contains 51 sections, 7 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: (a) Geospatial Video Analytics Workflow Execution. (b) Processing Plan for the workflow in the paper's Python workflow listing.
  • Figure 2: (a) The pyramid-shaped 3D viewable space of a camera and its projection onto the $z=0$ plane. (b) For the workflow in the paper's Python workflow listing, the video processor executes expensive ML models only on video frames with a visible intersection.
  • Figure 3: From a car's 2D location ($\star$) to its 3D location ($\star$).
  • Figure 4: (a) Sample Events exitsLane (i) and exitsCamera (ii). (b) Example Results of our Sampling Algorithm. (c) F1 score and runtime reduction for each skip distance when the Exit Frame Sampler can skip at least 1 frame.
  • Figure 5: (a) Video frames processed per second for each system. (b) Average video processing time for each video of 20 seconds, comparing each optimization technique. (c) AssA against (SB) vs. other experiment setups.