Artificial Microsaccade Compensation: Stable Vision for an Ornithopter

Levi Burner, Guido de Croon, Yiannis Aloimonos

TL;DR

A method for Artificial Microsaccade Compensation that stabilizes video captured by a tailless ornithopter, a platform that has resisted camera-based sensing because it shakes at 12-20 Hz. Compared to Adobe Premiere Pro's Warp Stabilizer, it achieves higher quality results while running in real time.

Abstract

Animals with foveated vision, including humans, experience microsaccades: small, rapid eye movements that they are not aware of. Inspired by this phenomenon, we develop a method for "Artificial Microsaccade Compensation". It can stabilize video captured by a tailless ornithopter that has resisted attempts to use camera-based sensing because it shakes at 12-20 Hz. Our approach minimizes changes in image intensity by optimizing over 3D rotation represented in SO(3). This results in a stabilized video that is computed in real time, suitable for human viewing, and free from distortion. When adapted to hold a fixed viewing orientation, up to occasional saccades, it can dramatically reduce inter-frame motion while also benefiting from an efficient recursive update. When compared to Adobe Premiere Pro's Warp Stabilizer, which is widely regarded as the best commercial video stabilization software available, our method achieves higher quality results while also running in real time.
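
As an illustration of what "minimizing changes in image intensity over SO(3)" can look like, one plausible form of the objective is given below. It is a sketch rather than the paper's own equation: it assumes a pinhole camera with intrinsic matrix $K$, a template frame $T$, an incoming frame $I$, a set of valid pixels $\Omega$, homogeneous pixel coordinates $\tilde{x}$, and the perspective division $\pi$. All of these symbols are notation introduced here, and the direction of the rotation is a convention choice.

```latex
\hat{R} \;=\; \arg\min_{R \in SO(3)} \;
  \frac{1}{|\Omega|} \sum_{x \in \Omega}
  \Big( I\big(\pi(K R K^{-1} \tilde{x})\big) - T(x) \Big)^{2},
\qquad
\pi\big([u, v, w]^{\top}\big) = \big[\,u/w,\; v/w\,\big]^{\top}
```

Under a pure rotation the image warp $K R K^{-1}$ does not depend on scene depth, which is why only three rotational parameters need to be optimized.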

Paper Structure

This paper contains 17 sections, 16 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: Qualitative comparison of stabilized and unstabilized frames. Top row, 6 averaged consecutive frames from a sequence illustrating the extreme shaking of a camera onboard the Flapper Nimble+ tailless ornithopter. Second row, a single unstabilized frame with normal flow (the projection of optical flow onto image gradients) illustrated with red arrows. Third row, 6 stabilized and averaged frames, which were warped to the stabilized viewpoint. Bottom row, 6 stabilized and averaged frames, which were warped to a stationary viewpoint that occasionally saccades. The flow magnitude can be seen to reduce with each row, with the saccading variation of the algorithm achieving a drastic reduction in flow magnitude. Videos showing stabilization results are available in the supplementary information.
  • Figure 2: Per frame metrics computed on sequence FF2. Normal flow and the RMS change in image brightness are reduced by the stabilization and significantly reduced by saccading instead of continuously rotating the stable view. The proposed frame averaging, which reduces the effects of rolling shutter and image artifacts, slightly reduces the stabilized image sharpness. The stabilization algorithm maintains a high quantity of valid pixels given 12.5% image margins. The saccading variant results in slightly fewer valid pixels. During flight, the angular velocities from the Flapper's onboard IMU do not align with the angular velocities of the image based orientation estimate. The stabilized orientation does not contain the high frequency oscillations of the estimated orientation. The saccade style stabilization maintains a viewpoint angular velocity of zero, except when it saccades, which results in an angular velocity spike.
  • Figure 3: Zoomed in view of angular velocity estimates. Left: Angular velocity estimates from an onboard IMU, motion capture, the image orientation tracker, stabilized view, and saccade times during flight. The Flapper's onboard IMU fails to capture accurate enough angular velocity estimates for image stabilization during flight. Right: When moved smoothly by hand, the Flapper's onboard IMU's angular velocity estimates almost perfectly align with the image based angular velocity estimates. The motion capture system's angular velocity estimates also align, except for estimates along the Y axis, which are poor in quality because of occluded markers.
  • Figure 4: Artificial Microsaccade Compensation allows stabilizing unstable videos, such as those captured by the Flapper Nimble+, by directly matching incoming frames to a periodically updated template frame. Under the assumption of small rotational disturbances, a direct optimization of the mean squared error between images is used to continuously update an orientation estimate $\hat{R}$. Subsequently, a smoothed viewpoint $R^{\mathrm{view}}$ is computed. The simplest option for the stable viewpoint is a constant orientation. However, a more sophisticated approach that smoothly tracks the system's viewpoint can be realized with a low-pass filter on the group of rotation matrices $SO(3)$. The frames and rotations are buffered and used in the Artificial Microsaccade Compensation process, which combines multiple frames, taken at multiple times, to realize a high quality and stable video that is free from distortion.
  • Figure 5: A detailed view of the "Stabilization" block in Figure 4. The left column is the average of 6 consecutive frames. While the individual frames are sharp, this averaging results in blur, which illustrates that the camera is shaking aggressively. The right column consists of the same six frames after stabilization and then averaging. In this paper, multiple frames are warped to the current stabilized view and averaged to reduce the effects of rolling shutter and transmission errors of a wireless camera onboard the Flapper. All displayed frames were computed using the camera's full resolution. Each timestep is associated with estimated orientations, $\hat{R}$, computed via a Lucas-Kanade tracker on the group of rotation matrices, $SO(3)$, and $R^{\mathrm{view}}$, a list of stable orientations computed by low-pass filtering $\hat{R}$. The stabilized output frame at the current time is computed by warping current and previous frames to the current stabilized viewpoint using stabilizing rotations $\hat{R}^{\mathrm{stab}}_{i,j} = R^{\mathrm{view}}_{i,0}\hat{R}_{0,j}$.
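
The orientation smoothing and stabilizing rotation described in the captions of Figures 4 and 5 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names and the smoothing gain `alpha` are hypothetical. The first-order geodesic filter is only one possible low-pass filter on $SO(3)$. The convention $R^{\mathrm{view}}_{i,0} = (R^{\mathrm{view}}_{0,i})^{\top}$ and the pinhole intrinsic matrix `K` used for the warp are also assumptions.

```python
import numpy as np


def hat(w):
    """Map a 3-vector to its skew-symmetric matrix."""
    x, y, z = w
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])


def so3_exp(w):
    """Rodrigues' formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-8:
        return np.eye(3) + W  # first-order approximation near the identity
    a = np.sin(theta) / theta
    b = (1.0 - np.cos(theta)) / theta**2
    return np.eye(3) + a * W + b * (W @ W)


def so3_log(R):
    """Rotation matrix -> rotation vector (valid away from 180 degree rotations)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    v = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < 1e-8:
        return 0.5 * v
    return theta / (2.0 * np.sin(theta)) * v


def lowpass_step(R_view_0i, R_hat_0i, alpha):
    """Assumed first-order filter: move the view a fraction alpha along the
    geodesic toward the current orientation estimate."""
    delta = so3_log(R_view_0i.T @ R_hat_0i)
    return R_view_0i @ so3_exp(alpha * delta)


def stabilizing_rotation(R_view_0i, R_hat_0j):
    """R_stab_{i,j} = R_view_{i,0} @ R_hat_{0,j}, assuming R_view_{i,0} is the
    transpose of R_view_{0,i}."""
    return R_view_0i.T @ R_hat_0j


def rotation_homography(K, R_stab):
    """Pixel warp induced by a pure rotation for an assumed pinhole camera K."""
    return K @ R_stab @ np.linalg.inv(K)
```

A per-frame loop would call `lowpass_step` once to update the view, then `stabilizing_rotation` and `rotation_homography` for each buffered frame before warping and averaging. Setting `alpha = 0` holds the view fixed, which corresponds loosely to the saccading variant between saccades.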