Synthesizing Trajectory Queries from Examples

Stephen Mell; Favyen Bastani; Steve Zdancewic; Osbert Bastani

Synthesizing Trajectory Queries from Examples

Stephen Mell, Favyen Bastani, Steve Zdancewic, Osbert Bastani

TL;DR

Quivr addresses the challenge of synthesizing trajectory queries with real-valued parameters from few examples by introducing a parameter-pruning framework based on a novel quantitative semantics. It defines a DSL for trajectory predicates with subsequence matching and enables multi-object queries via sequencing and iteration, while maintaining soundness and partial completeness. The approach yields substantial synthesis-time speedups, especially with GPU-accelerated evaluation, and achieves high accuracy on traffic, animal, and maritime trajectory tasks with minimal labeling. This work enables efficient, programmable querying of trajectory data with practical impact on autonomous driving, behavioral analysis, and surveillance.

Abstract

Data scientists often need to write programs to process predictions of machine learning models, such as object detections and trajectories in video data. However, writing such queries can be challenging due to the fuzzy nature of real-world data; in particular, they often include real-valued parameters that must be tuned by hand. We propose a novel framework called Quivr that synthesizes trajectory queries matching a given set of examples. To efficiently synthesize parameters, we introduce a novel technique for pruning the parameter space and a novel quantitative semantics that makes this more efficient. We evaluate Quivr on a benchmark of 17 tasks, including several from prior work, and show both that it can synthesize accurate queries for each task and that our optimizations substantially reduce synthesis time.

Synthesizing Trajectory Queries from Examples

TL;DR

Abstract

Paper Structure (27 sections, 7 theorems, 32 equations, 7 figures, 7 tables, 2 algorithms)

This paper contains 27 sections, 7 theorems, 32 equations, 7 figures, 7 tables, 2 algorithms.

Introduction
Overview
Query Language
Synthesis Algorithm
Problem Formulation
Algorithm Overview
Pruning with Boundary Parameters
Pruning with Pairs of Boundary Parameters
Pruning Parameter Search Algorithm
Computing Pruning Pairs via Quantitative Semantics
Implementation
Evaluation
Experimental Setup
Accuracy of Synthesized Queries
Synthesis Running Time
...and 12 more sections

Key Result

lemma thmcounterlemma

Given sketch $Q$, trajectory $z$, and two candidate parameters $\theta,\theta'\in\mathbb{R}^d$ such that $\theta\le\theta'$ component-wise, we have $\llbracket Q_\theta \rrbracket(z)\ge\llbracket Q_{\theta'} \rrbracket(z)$.

Figures (7)

Figure 1: (a) A video frame from a traffic camera, along with object trajectories (red) and manually annotated lanes (black). (b) The trajectories selected by the query (bottom), which selects cars turning at the intersection.
Figure 2: A single match (top) for the multi-object query (bottom) which captures one car, $A$, turning into a lane behind another car, $B$, that is in that lane. The trajectories change color from red to green as a function of time. As can be seen, the car making the right turn does so just after the car going straight passes through the intersection.
Figure 3: Satisfaction semantics of our query language; $z\in\mathcal{Z}$ is a trajectory of length $n$ and $\varphi\in\Phi$ are predicates. Iteration ($Q^k$) can be expressed as repeated sequencing.
Figure 4: (a) shows a boundary parameter, $\theta_1$, for $z_1$, and a region that is inconsistent with $z_1$ and can be pruned (red), as well as a region that is consistent with it (blue). (b) similarly shows a boundary parameter $\theta_0$ for $z_0$. (c) shows the pruning pair composed of $\theta_0$ and $\theta_1$, a region consistent with both (blue), and regions inconsistent with either (red). (d) is the same as (c), but if $\theta_0$ and $\theta_1$ swapped places. The labels $b_0$ through $b_8$ denote analogous boxes in (c) & (d). (e) shows how, if (d) were the result of the first step of search and $b_6$ were chosen next, search could proceed. (f) shows ground truth consistent (blue) and inconsistent (red) regions that the search process in (d) & (e) might converge toward.
Figure 5: The quantitative semantics of our language, taking a sketch $Q$, trajectory $z$, parameter $v \in \mathbb{R}^d$, and positive vector $u \in \mathbb{R}_{>0}^d$. $n$ is the length of $z$.
...and 2 more figures

Theorems & Definitions (12)

definition thmcounterdefinition
lemma thmcounterlemma
definition thmcounterdefinition
definition thmcounterdefinition
lemma thmcounterlemma
theorem thmcountertheorem
theorem thmcountertheorem
theorem thmcountertheorem
lemma thmcounterlemma
proof
...and 2 more

Synthesizing Trajectory Queries from Examples

TL;DR

Abstract

Synthesizing Trajectory Queries from Examples

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (7)

Theorems & Definitions (12)