Multi-Robot Planning for Filming Groups of Moving Actors Leveraging Submodularity and Pixel Density

Skyler Hughes; Rebecca Martin; Micah Corah; Sebastian Scherer

TL;DR

This work models actors as moving polyhedra, computes approximate pixel densities for each face and camera view, and proposes an objective that exhibits diminishing returns as pixel densities increase from repeated observation. The resulting multi-robot perception planning problem is solved via a combination of value iteration and greedy submodular maximization.

Abstract

Observing and filming a group of moving actors with a team of aerial robots is a challenging problem that combines elements of multi-robot coordination, coverage, and view planning. A single camera may observe multiple actors at once, and a robot team may observe individual actors from multiple views. As actors move about, groups may split, merge, and reform, and robots filming these actors should be able to adapt smoothly to such changes in actor formations. Rather than adopt an approach based on explicit formations or assignments, we propose an approach based on optimizing views directly. We model actors as moving polyhedra and compute approximate pixel densities for each face and camera view. Then, we propose an objective that exhibits diminishing returns as pixel densities increase from repeated observation. This gives rise to a multi-robot perception planning problem that we solve via a combination of value iteration and greedy submodular maximization. We evaluate our approach on challenging scenarios modeled after various social behaviors and featuring different numbers of robots and actors, and we observe that robot assignments and formations arise implicitly given the movements of groups of actors. Simulation results demonstrate that our approach consistently outperforms baselines. In addition to performing well with the planner's approximation of pixel densities, our approach also performs comparably for evaluation based on rendered views. Overall, the multi-round variant of the sequential planner we propose meets (within 1%) or exceeds formation and assignment baselines in all scenarios.
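The coordination idea in the abstract (robots choosing views in sequence under an objective with diminishing returns) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the square root stands in for whatever diminishing-returns transform the paper actually uses, each robot picks from a short hypothetical list of candidate views instead of planning a trajectory with value iteration, and the names `View`, `objective`, and `sequential_greedy` are not from the paper.

```python
import math
from typing import Dict, List

# Hypothetical representation: a view maps face ids to the approximate
# pixel density that view contributes on each face.
View = Dict[str, float]


def objective(views: List[View]) -> float:
    """Toy diminishing-returns objective (assumption, not the paper's SRPPA).

    Pixel densities from all views are summed per face, then passed through
    a square root so that repeated observation of a face is worth less.
    """
    totals: Dict[str, float] = {}
    for view in views:
        for face, density in view.items():
            totals[face] = totals.get(face, 0.0) + density
    return sum(math.sqrt(total) for total in totals.values())


def sequential_greedy(candidates: List[List[View]]) -> List[int]:
    """Each robot, in order, picks the candidate with the largest marginal gain.

    candidates[i] lists the views available to robot i. Returns the index
    of the view chosen by each robot.
    """
    chosen: List[View] = []
    picks: List[int] = []
    for options in candidates:
        base = objective(chosen)
        gains = [objective(chosen + [view]) - base for view in options]
        best = max(range(len(options)), key=gains.__getitem__)
        picks.append(best)
        chosen.append(options[best])
    return picks


# Two robots with identical options: film face "a1" or film face "b1".
options = [{"a1": 4.0}, {"b1": 3.0}]
picks = sequential_greedy([options, options])
```

In this toy case the first robot takes the higher-density view of `a1`, and the second robot then prefers `b1` because a second look at `a1` adds little. This mirrors, in miniature, the abstract's observation that assignments arise implicitly from optimizing views directly.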

Paper Structure (36 sections, 5 theorems, 17 equations, 5 figures, 1 table)

Introduction
Related work
Contributions
Background
Submodularity and monotonicity
Submodular optimization for multi-robot coordination
Problem formulation
Motion model
Actor motion and representation
Camera and sensor model
Assumptions
Objective function
Planning approach
Single robot planning
Sequential planning and coordination
...and 21 more sections

Key Result

Theorem 1

The SRPPA objective $g^\mathrm{obj}$ (defined in the paper's objective equation) is normalized, monotonic, and submodular. Moreover, SRPPA satisfies alternating monotonicity conditions: it is $n$-increasing for odd $n$ and $n$-decreasing for even $n$.
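The properties named in Theorem 1 can be checked by brute force on a small example. The sketch below is not the SRPPA objective: it uses a hypothetical toy function (a square root of summed weights, i.e. a real function composed with a modular one, in the spirit of Lemma 1) and assumes the convention that "$n$-increasing" means every $n$-th order discrete derivative is nonnegative and "$n$-decreasing" means it is nonpositive. The names `WEIGHTS`, `toy_objective`, `derivative`, and `has_sign` are illustrative only.

```python
import itertools
import math

# Hypothetical toy weights: element -> pixel-density-like contribution.
WEIGHTS = {0: 1.0, 1: 2.0, 2: 0.5, 3: 3.0}


def toy_objective(subset):
    """Square root of a modular (additive) function; an illustrative stand-in."""
    return math.sqrt(sum(WEIGHTS[x] for x in subset))


def derivative(f, elems, base):
    """Discrete derivative of f at `base` with respect to the distinct `elems`.

    Order 1 is the marginal gain f(base + x) - f(base); higher orders are
    differences of lower-order derivatives.
    """
    if not elems:
        return f(base)
    x, rest = elems[0], elems[1:]
    return derivative(f, rest, base | {x}) - derivative(f, rest, base)


def has_sign(f, ground, order, sign, tol=1e-12):
    """True if sign * (order-th derivative) >= 0 for all disjoint elems and bases."""
    for elems in itertools.combinations(sorted(ground), order):
        remaining = [x for x in ground if x not in elems]
        for k in range(len(remaining) + 1):
            for base in itertools.combinations(remaining, k):
                if sign * derivative(f, elems, frozenset(base)) < -tol:
                    return False
    return True


ground = set(WEIGHTS)
normalized = toy_objective(frozenset()) == 0.0   # f(empty set) = 0
monotonic = has_sign(toy_objective, ground, 1, +1)    # 1-increasing
submodular = has_sign(toy_objective, ground, 2, -1)   # 2-decreasing
three_increasing = has_sign(toy_objective, ground, 3, +1)
```

For this toy function all four flags come out true, matching the alternating sign pattern described in the theorem. The check enumerates every subset, so it is only practical for very small ground sets.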

Figures (5)

Figure 1: A team of robots work together to film an unstructured scene with multiple moving actors splitting and merging.
Figure 2: Scene representation: Each actor $a\in{\mathcal{A}}$ is modeled as a hexagonal prism and follows a trajectory $Y_a$ in the plane. Robots move on a grid at height $d_h$ and carry a camera with field-of-view ${\gamma}$ and declination ${\phi}$.
Figure 3: Summary of experiment scenarios: Each scenario is shown from a top-down view with actors as magenta hexagons. The actor paths are demarcated by black lines, and dots mark the starting positions. ${N^\mathrm{r}}$ and ${N^\mathrm{a}}$ refer to the number of robots and the number of actors, respectively. All actors (and faces) have the same priority $w_f\!=\!1$ (see the view-quality equation), except that in the priority-runners scenario the lead runner has $w_f\!=\!10$ and in the priority-speaker scenario the stationary actor (the speaker) has $w_f\!=\!5$.
Figure 4: Trajectories from Multi-Round Greedy on the cross-mix scenario: Each uniquely colored circle represents an actor, and pairs initially have similar colors. At the point of crossing, two out of the three groups swap partners. Our coordination scheme naturally handles the complex actor movement and produces good view diversity. Shown below are the four camera views used in the rendering-based evaluation. Each face of each actor has a unique color.
Figure 5: View rewards plotted for each planner for select scenarios: In all cases, peaks correspond to when actors are near each other and troughs to when they are far apart. Plots show SRPPA as approximated by the pixel-term equation, except that for the split-and-join scenario we also compare to the image-based evaluation (see the rendering evaluation section).

Theorems & Definitions (14)

Remark 1: Intuition for perception objective
Theorem 1: Monotonicity properties of SRPPA
Corollary 1.1: Bounded suboptimality
Lemma 1: Monotonicity for composing a real and a modular function
Remark 2: Transformations with other real functions
Remark 3: Relationship between coverage and alternating derivatives
Definition 1: Derivative of a set function
Definition 2: Higher-order monotonicity of set functions
Definition 3: Monotonicity of real functions
Definition 4: Modular set function
...and 4 more
