A Monocular Event-Camera Motion Capture System
Leonard Bauersfeld, Davide Scaramuzza
TL;DR
The paper presents a monocular motion capture system that uses a static event camera and blinking LED markers to recover the full 6-DOF pose of an object by solving the Perspective-n-Point (PnP) problem. It introduces a Signed Delta-Time Volume representation for robust LED frequency detection and uses SQPnP for fast, globally stable pose estimation, achieving millimeter accuracy with millisecond latency. The approach is validated through static-noise experiments that show the advantages of SQPnP over EPnP, and through a real-time closed-loop quadrotor flight with pose estimates at ~400 Hz and an end-to-end latency of around 2.5 ms. The system enables high-precision pose tracking in narrow, confined spaces at lower cost and in a compact form factor, which broadens its applicability in mobile robotics and manipulation.
Abstract
Motion capture systems are a widespread tool in research to record ground-truth poses of objects. Commercial systems use reflective markers attached to the object and then triangulate the pose of the object from multiple camera views. Consequently, the object must be visible to multiple cameras, which makes such multi-view motion capture systems unsuited for deployments in narrow, confined spaces (e.g., ballast tanks of ships). In this technical report, we describe a monocular event-camera motion capture system that overcomes this limitation and is ideally suited for narrow spaces. Instead of passive markers, it relies on active, blinking LED markers, so that each marker can be uniquely identified by its blinking frequency. The markers are placed at known locations on the tracked object. We then solve the Perspective-n-Point (PnP) problem to obtain the position and orientation of the object. The developed system has millimeter accuracy and millisecond latency, and we demonstrate that its state estimate can be used to fly a small, agile quadrotor.
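To make the marker-identification idea concrete, the following is a minimal Python sketch of how a blinking LED could be matched to a marker ID from event timestamps. It is an illustration under stated assumptions, not the method of the report: the function names, the frequency table, and the tolerance are hypothetical, and a simple median-period estimate stands in for the Signed Delta-Time Volume representation mentioned in the TL;DR.

```python
from statistics import median


def estimate_blink_frequency(on_event_times_s):
    """Estimate a blink frequency in Hz from the timestamps (in seconds)
    of positive-polarity events observed at one image location.

    Assumption: one positive event per LED turn-on, so consecutive
    timestamps are one blink period apart. The median makes the estimate
    tolerant to a few spurious or missing events.
    """
    if len(on_event_times_s) < 2:
        raise ValueError("need at least two events to estimate a period")
    times = sorted(on_event_times_s)
    periods = [later - earlier for earlier, later in zip(times, times[1:])]
    return 1.0 / median(periods)


def identify_marker(freq_hz, marker_freqs_hz, tolerance_hz=50.0):
    """Return the ID of the marker whose nominal frequency is closest to
    freq_hz, or None if no marker lies within tolerance_hz.

    marker_freqs_hz is a hypothetical mapping {marker_id: nominal_frequency}.
    """
    best_id = min(marker_freqs_hz, key=lambda k: abs(marker_freqs_hz[k] - freq_hz))
    if abs(marker_freqs_hz[best_id] - freq_hz) > tolerance_hz:
        return None
    return best_id


# Hypothetical frequency assignment for four LEDs (values are illustrative only).
MARKER_FREQS_HZ = {0: 1000.0, 1: 1500.0, 2: 2000.0, 3: 2500.0}

# Synthetic positive-polarity events from an LED blinking at 1500 Hz.
events = [i / 1500.0 for i in range(10)]
freq = estimate_blink_frequency(events)
marker_id = identify_marker(freq, MARKER_FREQS_HZ)
```

Once each detected blob carries a marker ID, its pixel location can be paired with the known 3D position of that marker on the object, which gives the 2D-3D correspondences that a PnP solver requires.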
