Motion as a Sensing Modality for Metric Scale in Monocular Visual-Inertial Odometry

Hadush Hailu, Bruk Gebregziabher

Abstract

Monocular visual-inertial odometry (VIO) cannot recover metric scale from vision alone; scale must be resolved through inertial measurements. We present a trajectory-dependent observability analysis showing that translational acceleration, produced by curvature, not constant-speed straight-line travel, is the fundamental source that couples scale to the inertial state. This relationship is formalized through the gravity-acceleration asymmetry in the IMU model, from which we derive rank conditions on the observability matrix and propose a lightweight excitation metric computable from raw IMU data. Controlled experiments on a differential-drive robot with a monocular camera and consumer-grade IMU validate the theory: straight-line motion yields 9.2% scale error, circular motion 6.4%, and figure-eight motion 4.8%, while excitation spans four orders of magnitude. These results establish trajectory design as a practical mechanism for improving metric scale recovery.

Paper Structure

This paper contains 37 sections, 2 theorems, 20 equations, 15 figures, 7 tables.

Key Result

Proposition 1

Let the accelerometer measurement be modeled by eq:scaled_imu. The Fisher information contributed by a single IMU sample toward the scale parameter $s$ is

$$\mathcal{I}(s) = \frac{\|\dot{\mathbf{v}}\|^2}{\sigma_a^2},$$

where $\sigma_a^2$ is the accelerometer noise variance. Consequently, scale information vanishes if and only if translational acceleration vanishes.
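The dependence on translational acceleration alone can be checked with a short derivation. The sketch below assumes the three-term decomposition described in Figure 2 plus additive zero-mean Gaussian noise $\mathbf{n}$ with covariance $\sigma_a^2\mathbf{I}_3$; the noise model, the symbol $\mathcal{I}(s)$, and the sign convention on the gravity term are assumptions made here for illustration (the sign does not affect the result, since gravity does not depend on $s$).

```latex
% Assumed measurement model (three components as in Figure 2, plus noise)
\mathbf{a}_m = s\,\mathbf{R}_{WI}^\top \dot{\mathbf{v}}
             + \mathbf{R}_{WI}^\top \mathbf{g}
             + \mathbf{b}_a + \mathbf{n},
\qquad \mathbf{n} \sim \mathcal{N}(\mathbf{0},\, \sigma_a^2 \mathbf{I}_3)

% Only the first term depends on s
\frac{\partial \mathbf{a}_m}{\partial s} = \mathbf{R}_{WI}^\top \dot{\mathbf{v}}

% Fisher information of a Gaussian measurement with respect to s;
% the rotation is orthogonal, so it preserves the norm
\mathcal{I}(s)
  = \frac{1}{\sigma_a^2}
    \left\| \frac{\partial \mathbf{a}_m}{\partial s} \right\|^2
  = \frac{\| \mathbf{R}_{WI}^\top \dot{\mathbf{v}} \|^2}{\sigma_a^2}
  = \frac{\| \dot{\mathbf{v}} \|^2}{\sigma_a^2}
```

Under these assumptions, constant-velocity motion ($\dot{\mathbf{v}} = \mathbf{0}$) contributes no information about $s$, while a circular arc at speed $v$ and radius $r$ contributes $\mathcal{I}(s) = (v^2/r)^2/\sigma_a^2$ per sample through its centripetal acceleration.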

Figures (15)

  • Figure 1: Monocular scale ambiguity (pinhole projection). The optical center $O$ and image plane (at focal distance $f$) define the camera. A 3D point $\mathbf{P}=(X,Y,Z)$ (blue) and its scaled counterpart $s\mathbf{P}$ (red) lie on the same bearing ray and therefore project onto the identical image point $\mathbf{p}=(x,y,f)$. Because the projection $\mathbf{p} = f\,\mathbf{P}/Z$ is invariant to global rescaling $\mathbf{P}\!\to\!s\mathbf{P}$, metric scale is unrecoverable from monocular images alone.
  • Figure 2: Accelerometer measurement decomposition. The world frame $W$ and body (IMU) frame $I$ are related by rotation $\mathbf{R}_{WI}$. The accelerometer reading $\mathbf{a}_m$ decomposes into three components: scale-dependent body acceleration $s\,\mathbf{R}_{WI}^\top\dot{\mathbf{v}}$ (red), gravity $\mathbf{R}_{WI}^\top \mathbf{g}$ (blue, fixed magnitude), and bias $\mathbf{b}_a$ (orange). Because only the acceleration term scales with $s$ while gravity provides a fixed reference, inertial measurements break the monocular scale ambiguity.
  • Figure 3: Trajectory-dependent scale observability. Straight-line motion (a) produces negligible translational acceleration, leaving scale unobservable. Constant curvature (b) injects steady centripetal acceleration in a fixed direction. Figure-eight motion (c) generates time-varying acceleration with reversing curvature (dotted arrows), maximizing $\partial\mathbf{a}_m/\partial s$. Bars indicate relative scale information.
  • Figure 4: Differential-drive robot platform. (a) Photograph of the four-wheel robot. (b) Top-view schematic showing component placement and coordinate frames. Body $\{B\}$ has $x_B$ forward, $y_B$ left, $z_B$ up ($\odot$). Camera $\{C\}$ has $z_C \!\parallel\! x_B$ (optical axis forward), $x_C$ out of page ($\odot$). IMU $\{I\}$ has $x_I \!\parallel\! x_B$, $z_I$ inverted, pointing toward $\mathbf{g}$ ($\otimes$).
  • Figure 5: Experimental environment. (a) Indoor workspace with textured floor providing visual features. (b) Plan view showing the three trajectory types executed within the workspace (all ${\approx}\,3$ m path length).
  • ...and 10 more figures
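The scale ambiguity described in the Figure 1 caption can be checked numerically. The following is a minimal sketch of the pinhole projection $\mathbf{p} = f\,\mathbf{P}/Z$; the function name `project`, the focal length, and the sample point are hypothetical values chosen for illustration.

```python
import numpy as np

def project(P, f=1.0):
    """Pinhole projection p = f * P / Z, returned as (x, y, f)."""
    P = np.asarray(P, dtype=float)
    return f * P / P[2]

# Hypothetical 3D point in the camera frame (meters).
P = np.array([0.4, -0.2, 2.0])

# Rescaling the point along its bearing ray leaves the image point unchanged.
for s in (0.5, 1.0, 3.0):
    print(f"s = {s}: p = {project(s * P)}")
```

Every scale factor prints the same image point, which is why a monocular camera alone cannot distinguish $\mathbf{P}$ from $s\mathbf{P}$.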

Theorems & Definitions (3)

  • Proposition 1: Scale--acceleration coupling
  • Remark 1: Geometric intuition
  • Corollary 1: Excitation as a scale-information proxy
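The idea behind Corollary 1 can be illustrated with a simple proxy computed from accelerometer samples. The sketch below is not the paper's metric, whose exact definition is not given in this summary; it assumes a level, planar robot whose gravity reading in the body frame is known and constant, ignores bias and noise, and uses the mean squared norm of the gravity-compensated signal as the excitation value. The function name `excitation`, the gravity vector, and the speed and radius are all hypothetical.

```python
import numpy as np

def excitation(accel_body, gravity_body):
    """Mean squared norm of gravity-compensated accelerometer samples (m^2/s^4).

    Assumption: gravity_body is the constant accelerometer reading at rest,
    which only holds for a robot that stays level.
    """
    residual = np.asarray(accel_body, dtype=float) - np.asarray(gravity_body, dtype=float)
    return float(np.mean(np.sum(residual**2, axis=1)))

# Hypothetical at-rest reading for a level robot (m/s^2).
g_body = np.array([0.0, 0.0, 9.81])
n = 500

# Straight line at constant speed: no translational acceleration.
straight = np.tile(g_body, (n, 1))

# Circle at speed v and radius r: constant centripetal acceleration v^2 / r
# along the body y-axis (hypothetical values).
v, r = 0.3, 0.5
circle = straight + np.array([0.0, v**2 / r, 0.0])

print("straight:", excitation(straight, g_body))
print("circle:  ", excitation(circle, g_body))
```

In this idealized setting the straight-line value is exactly zero and the circular value equals $(v^2/r)^2$, so the proxy tracks the squared translational acceleration that carries scale information. With real data, noise and bias add a floor to the straight-line value, and orientation changes would require rotating gravity into the body frame before subtraction.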