Differentiable-Optimization Based Neural Policy for Occlusion-Aware Target Tracking

Houman Masnavi; Arun Kumar Singh; Farrokh Janabi-Sharifi

TL;DR

A learned probabilistic neural policy for safe, occlusion-free target tracking that combines generative modeling based on a Conditional Variational Autoencoder (CVAE) with differentiable optimization layers, and that outperforms the state of the art (SOTA) in occlusion/collision avoidance and computation time.

Abstract

Tracking a target in cluttered and dynamic environments is challenging but forms a core component of applications like aerial cinematography. The obstacles in the environment not only pose a collision risk but can also occlude the target from the robot's field of view. Moreover, the target's future trajectory may be unknown, and only its current state can be estimated. In this paper, we propose a learned probabilistic neural policy for safe, occlusion-free target tracking. The core novelty of our work stems from the structure of our policy network, which combines generative modeling based on a Conditional Variational Autoencoder (CVAE) with differentiable optimization layers. The role of the CVAE is to provide a base trajectory distribution, which is then projected onto a learned feasible set through the optimization layer. Furthermore, both the weights of the CVAE network and the parameters of the differentiable optimization can be learned end-to-end from demonstration trajectories. We improve the state of the art (SOTA) in the following respects. First, we show that our learned policy outperforms existing SOTA in terms of occlusion/collision avoidance capabilities and computation time. Second, we present an extensive ablation showing how different components of our learning pipeline contribute to the overall tracking task. Third, we demonstrate the real-time performance of our approach on resource-constrained hardware such as the NVIDIA Jetson TX2. Finally, our learned policy can also be viewed as a reactive planner for navigation in highly cluttered environments.
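
The sample, project, and select structure described above (and in the Figure 2 caption) can be illustrated with a minimal sketch. It is not the authors' implementation: a zero-mean Gaussian stands in for the CVAE decoder, clipping to a box stands in for the learned projection optimizer, and `tracking_cost` is a hypothetical cost. All function names and parameters here are assumptions made for illustration.

```python
import random


def sample_base_coefficients(num_samples, dim, rng):
    # Stand-in for the CVAE decoder (assumption): zero-mean Gaussian samples.
    return [[rng.gauss(0.0, 2.0) for _ in range(dim)] for _ in range(num_samples)]


def project_to_box(xi, lower, upper):
    # Stand-in for the learned projection (assumption): clip each coefficient
    # to [lower, upper], which is the Euclidean projection onto a box.
    return [min(max(c, lower), upper) for c in xi]


def tracking_cost(xi, target):
    # Hypothetical cost: squared distance to a reference coefficient vector.
    return sum((c - t) ** 2 for c, t in zip(xi, target))


def select_trajectory(num_samples, dim, target, lower=-1.0, upper=1.0, seed=0):
    rng = random.Random(seed)
    samples = sample_base_coefficients(num_samples, dim, rng)
    # Project every sample onto the feasible set before ranking.
    projected = [project_to_box(xi, lower, upper) for xi in samples]
    # Sort by cost and return the cheapest candidate plus the full ranking.
    ranked = sorted(projected, key=lambda xi: tracking_cost(xi, target))
    return ranked[0], ranked
```

Calling `select_trajectory(32, 4, [0.5, -0.5, 0.25, 0.0])` returns the lowest-cost projected sample and the ranked list. In the paper the feasible set is learned and the projection is differentiable, so the box used here only marks where that layer would sit.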

Paper Structure (24 sections, 18 equations, 6 figures, 4 tables)

Introduction
Problem Formulation
Optimization for Target Tracking
Polynomial Parameterization
Main Algorithmic Results
CVAE with Differentiable Optimization Layers
Differentiation Through the Optimization Layer
Connections to Prior Works
Validation and Benchmarking
Implementation Details
CVAE Training
Baselines
Metrics
Target Tracking in Static Environments
Tracking with a maximum target speed of 1m/s
...and 9 more sections

Figures (6)

Figure 1: (a) and (b) show top-down views of regular target tracking vs. occlusion-aware target tracking, respectively. When there is no occlusion requirement (a), the robot tracks the target while avoiding the obstacle from the bottom. However, when the occlusion requirement is added, it forces the robot to avoid the obstacle from the top to keep its line of sight unobstructed.
Figure 2: Our learning-based approach for solving \ref{cost_reform}-\ref{ineq_reform} that relies on sampling trajectory coefficients $\boldsymbol{\xi}_j$ from a learned distribution conditioned on the observations (point clouds, states). The samples $\boldsymbol{\xi}_j$ are sorted based on their associated cost and the one with the least value is selected as the optimal solution. To ensure that sampled $\boldsymbol{\xi}_j$ lead to safe and kinematically feasible trajectories, the learned distribution is structured in the form of a CVAE augmented with a differentiable projection optimizer.
Figure 3: Proposed CVAE architecture augmented with a differentiable optimization layer. We use PointNet to encode point clouds into latent features that form part of the CVAE's conditioning.
Figure 4: The unrolled structure of our differentiable projection optimizer. It includes solving a sequence of linear systems. We can backpropagate through the optimizer iterations to compute how ${^K}\boldsymbol{\xi}$ would change with respect to the CVAE decoder output $\overline{\boldsymbol{\xi}}, {^0}\boldsymbol{\xi}, {^0}\boldsymbol{\lambda}, \mathbf{q}$.
Figure 5: Target tracking results for AutoChaser \cite{auto_chaser_1} (a), Proj-CEM \cite{ral_vis_aware} (b), conventional behavior cloning (BC) (c), and our proposed approach (d). The environment consists of 6 obstacles and the target moves with a maximum speed of 1 m/s. AutoChaser loses sight of the target around each obstacle, while Proj-CEM and conventional BC perform much better and have only minor occlusion around one of the obstacles. Our proposed method finishes the task without any occlusion of the target throughout the whole run.
...and 1 more figures
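
The idea in the Figure 4 caption, obtaining the sensitivity of the final iterate to the optimizer's input by unrolling the iterations, can be sketched on a scalar toy problem. This is an assumption-laden illustration rather than the paper's optimizer: the paper's layer solves a sequence of linear systems and is differentiated by backpropagation, whereas this sketch swaps in plain gradient steps on a quadratic penalty for a single upper bound and carries the derivative forward by the chain rule at every step. For a scalar input the forward and reverse derivatives coincide. The names `unrolled_projection`, `upper`, `rho`, and `step` are hypothetical.

```python
def unrolled_projection(xi_bar, upper=1.0, rho=10.0, step=0.05, num_iters=50):
    """Approximately project xi_bar onto {x <= upper} with a soft penalty.

    Minimizes 0.5*(x - xi_bar)**2 + 0.5*rho*max(0, x - upper)**2 by
    gradient descent and returns the final iterate together with its
    derivative with respect to xi_bar.
    """
    x = xi_bar   # start from the unprojected input
    dx = 1.0     # derivative of the initial iterate w.r.t. xi_bar
    for _ in range(num_iters):
        # The penalty term is active only while the bound is violated.
        active = 1.0 if x > upper else 0.0
        grad = (x - xi_bar) + rho * active * (x - upper)
        # Derivative of grad w.r.t. xi_bar, treating `active` as constant.
        dgrad = (dx - 1.0) + rho * active * dx
        x = x - step * grad
        dx = dx - step * dgrad
    return x, dx
```

With the default settings, an input that violates the bound converges to the penalized solution `(xi_bar + rho * upper) / (1 + rho)`, and the returned derivative approaches `1 / (1 + rho)`. An input that already satisfies the bound is returned unchanged with derivative 1.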
