A Monocular Event-Camera Motion Capture System

Leonard Bauersfeld, Davide Scaramuzza

TL;DR

The paper presents a monocular motion capture system that uses a static event camera and blinking LED markers to recover the full 6-DOF pose via a Perspective-n-Point (PnP) solve. It introduces a Signed Delta-Time Volume (SDTV) representation for robust LED frequency detection and uses SqPnP for fast, globally stable pose estimation, achieving millimeter accuracy with millisecond latency. The approach is validated through static-noise experiments, in which SqPnP outperforms EPnP, and through a real-time closed-loop quadrotor flight at ~400 Hz with an end-to-end latency of around 2.5 ms. This work enables high-precision pose tracking in narrow, confined spaces at a lower cost and with a compact form factor, broadening applicability in mobile robotics and manipulation.

Abstract

Motion capture systems are a widespread tool in research to record ground-truth poses of objects. Commercial systems use reflective markers attached to the object and then triangulate the pose of the object from multiple camera views. Consequently, the object must be visible to multiple cameras, which makes such multi-view motion capture systems unsuitable for deployments in narrow, confined spaces (e.g. ballast tanks of ships). In this technical report we describe a monocular event-camera motion capture system which overcomes this limitation and is ideally suited for narrow spaces. Instead of passive markers it relies on active, blinking LED markers such that each marker can be uniquely identified from its blinking frequency. The markers are placed at known locations on the tracked object. We then solve the PnP (perspective-n-point) problem to obtain the position and orientation of the object. The developed system has millimeter accuracy and millisecond latency, and we demonstrate that its state estimate can be used to fly a small, agile quadrotor.
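
The PnP step mentioned in the abstract can be illustrated by the forward model it inverts: markers at known body-frame positions are projected into the image, and a solver looks for the pose whose projection best matches the detected marker pixels. The following is a minimal sketch of that projection and the resulting reprojection error, not the authors' implementation. The marker layout, the focal lengths `FX`, `FY`, the principal point `CX`, `CY`, and all function names are hypothetical values chosen for illustration.

```python
import math

def rotation_z(yaw):
    """Rotation matrix for a rotation of `yaw` radians about the z axis."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def project(markers_body, R_cb, t_cb, fx, fy, cx, cy):
    """Project body-frame marker positions into pixel coordinates."""
    pixels = []
    for p in markers_body:
        # Transform the marker from the body frame into the camera frame.
        pc = [sum(R_cb[i][j] * p[j] for j in range(3)) + t_cb[i] for i in range(3)]
        # Pinhole projection; the camera z axis is the optical axis.
        pixels.append((fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy))
    return pixels

def reprojection_rmse(observed, predicted):
    """Root-mean-square pixel distance between observed and predicted markers."""
    sq = [(u - up) ** 2 + (v - vp) ** 2
          for (u, v), (up, vp) in zip(observed, predicted)]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical marker layout (meters, body frame) and camera intrinsics (pixels).
MARKERS_BODY = [(0.05, 0.05, 0.0), (-0.05, 0.05, 0.0),
                (-0.05, -0.05, 0.0), (0.05, -0.05, 0.0)]
FX = FY = 1000.0
CX, CY = 320.0, 240.0

# "Observed" pixels, simulated here with the object 2 m in front of the camera.
true_pixels = project(MARKERS_BODY, rotation_z(0.0), [0.0, 0.0, 2.0], FX, FY, CX, CY)

# Score two candidate poses: the true one and one that is 0.2 m too far away.
error_true = reprojection_rmse(
    true_pixels, project(MARKERS_BODY, rotation_z(0.0), [0.0, 0.0, 2.0], FX, FY, CX, CY))
error_wrong = reprojection_rmse(
    true_pixels, project(MARKERS_BODY, rotation_z(0.0), [0.0, 0.0, 2.2], FX, FY, CX, CY))

print(error_true, error_wrong)
```

The correct pose reproduces the observed pixels exactly, while the displaced pose yields a nonzero error. A PnP solver such as SqPnP or EPnP estimates the pose from such 3D-to-2D correspondences directly rather than by scoring candidates one at a time.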

Paper Structure

This paper contains 19 sections, 3 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Definition of the camera frame $\mathcal{C}$ ($z_\mathcal{C}$ is aligned with the optical axis), the body frame $\mathcal{B}$ and the world frame $\mathcal{W}$. The positions of the active markers are defined in the body frame and must be known. The transform $T_{\mathcal{C} \mathcal{B}}$ is estimated through PnP and the pose of the camera in the world frame is assumed to be known.
  • Figure 2: The object is placed statically in front of the camera (with the 25 mm lens) at distances between 70 cm and 5 m. The plots show the standard deviation of the position measurements ($z_\mathcal{C}$ and $x_\mathcal{C}$, $y_\mathcal{C}$) as well as of the orientation measurements. We can clearly see that SqPnP [terzakis2020sqpnp] outperforms EPnP [lepetit2009epnp] by a large margin.
  • Figure 3: Closed-loop experiments: the drone flies a rectangular pattern starting at (2, 0.1) and then lands at a distance of 2.5 m. The event camera is located at the origin of the coordinate system at $(x_\mathcal{W}, y_\mathcal{W}) = (0,0)$ and the optical axis of the 50 mm lens is aligned with the $x_\mathcal{W}$ direction. The bottom plot shows the roll and pitch angle measurements during the flight.
  • Figure 4: Illustration of the construction of the Signed Delta-Time Volume (SDTV) from an event stream. a) The LED is blinking with a period of $300\,\mu\text{s}$ and a duty cycle of 10%. b) A single pixel of the event camera records a noisy signal of this blinking LED. False double events (e.g. at $t = 150\,\mu\text{s}, 165\,\mu\text{s}$) and spurious events (e.g. at $t = 630\,\mu\text{s}$) are included. c) Construction of the SDTV, illustrated before and after processing the latest time window. d) Periods are robustly identified from the SDTV by summing up absolute time differences between negative $\rightarrow$ positive transitions (the first positive value is included). All events until the first positive $\rightarrow$ negative transition are discarded.
  • Figure 5: Circuit diagram of the complete blinking LED circuit. For simplicity we only show one LED driver circuit (area enclosed by the dotted line), which is replicated to drive multiple LEDs. The input voltage range of the power supply $V_\text{DD}$ is 8 V to 35 V. The resistor and capacitor values $R_A$, $R_B$, and $C$ for astable NE555 operation at different frequencies are given in Tab. \ref{tab:resistor_capacitor_values}. For improved clarity, the standard 100 nF ceramic bypass capacitors that stabilize VCC for the NE555 and ADG802 have been omitted from the circuit diagram.
  • ...and 1 more figures
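
The Figure 4 caption describes recovering a blinking period from the timing of negative → positive transitions in a single pixel's event stream. The sketch below is a deliberately simplified stand-in for that idea, not the SDTV itself: it collects the timestamps of negative → positive polarity transitions and takes the median spacing between them. The function name `estimate_period_us`, the event tuple layout, and the sample timestamps are assumptions made for illustration. The sample stream uses the 300 µs period and 10% duty cycle from the caption, plus one false double event and one spurious negative event.

```python
from statistics import median

def estimate_period_us(events):
    """Estimate the blinking period from one pixel's events.

    `events` is a time-sorted list of (timestamp_us, polarity) tuples with
    polarity +1 or -1. Returns the median time between consecutive
    negative -> positive transitions, or None if fewer than two are found.
    """
    rising = []
    prev_polarity = None
    for timestamp, polarity in events:
        # Only a positive event that directly follows a negative event counts,
        # so a repeated positive event (false double event) is ignored.
        if prev_polarity == -1 and polarity == +1:
            rising.append(timestamp)
        prev_polarity = polarity
    if len(rising) < 2:
        return None
    spacings = [b - a for a, b in zip(rising, rising[1:])]
    return median(spacings)

# Hypothetical event stream: LED on for 30 us out of every 300 us.
EVENTS = [
    (0, +1), (30, -1),
    (300, +1), (315, +1),   # false double event at 315 us
    (330, -1), (450, -1),   # spurious negative event at 450 us
    (600, +1), (630, -1),
    (900, +1), (930, -1),
    (1200, +1),
]

period = estimate_period_us(EVENTS)
frequency_hz = 1e6 / period
print(period, frequency_hz)
```

This simplification tolerates repeated events of the same polarity, but a spurious positive event between two blinks would still create a false transition and distort the spacings. Handling such noise robustly is the purpose of the SDTV described in the paper.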