Metric, inertially aligned monocular state estimation via kinetodynamic priors

Jiaxin Liu, Min Li, Wanting Xu, Liang Li, Jiaqi Yang, Laurent Kneip

TL;DR

The paper tackles monocular state estimation for non-rigid platforms where deformation breaks rigid-body assumptions. It introduces a kineto-dynamic framework that combines continuous-time B-spline motion on $\mathrm{SE}(3)$ with a Deformation-force Network mapping relative pose to accelerations, linking observed trajectory acceleration to deformation-induced dynamics through $F=ma$. The approach yields metric scale and gravity alignment using only a single camera by attributing unmodeled motion to elastic deformation, effectively enabling passive inertial sensing. Experimental validation on a spring-camera setup, including simulations and real data, demonstrates robust scale recovery and inertial alignment, suggesting broad potential for flexible robotic platforms with known motion models and elastic actuation.

Abstract

Accurate state estimation for flexible robotic systems poses significant challenges, particularly for platforms with dynamically deforming structures that invalidate rigid-body assumptions. This paper tackles this problem and makes it possible to extend existing rigid-body pose estimation methods to non-rigid systems. Our approach hinges on two core assumptions: first, the elastic properties are captured by an injective deformation-force model, efficiently learned via a Multi-Layer Perceptron; second, we solve for the platform's inherently smooth motion using continuous-time B-spline kinematic models. By continuously applying Newton's Second Law, our method establishes a physical link between visually derived trajectory acceleration and predicted deformation-induced acceleration. We demonstrate that our approach not only enables robust and accurate pose estimation on non-rigid platforms, but also that the properly modeled platform physics give rise to inertial sensing properties. We demonstrate feasibility on a simple spring-camera system, and show how the approach robustly resolves the typically ill-posed problem of metric scale and gravity recovery in monocular visual odometry.
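To make the link between spline acceleration and metric scale concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It uses a uniform cubic B-spline over positions in $\mathbb{R}^3$ rather than poses on $\mathrm{SE}(3)$, and it treats the metric acceleration as given, where the paper would obtain it from the learned deformation-force model. The function names `bspline_accel` and `estimate_scale` are hypothetical.

```python
import numpy as np


def bspline_accel(ctrl, dt, t):
    """Second time derivative of a uniform cubic B-spline at time t.

    ctrl: (N, D) array of control points (N >= 4).
    dt:   knot spacing in seconds.
    t:    query time in [0, (N - 3) * dt].
    Assumption: positions only; rotations on SE(3) are not handled here.
    """
    ctrl = np.asarray(ctrl, dtype=float)
    i = min(int(t // dt), len(ctrl) - 4)  # index of the active segment
    u = t / dt - i                        # local parameter in [0, 1]
    # Second differences of the four control points of this segment.
    d2_0 = ctrl[i] - 2.0 * ctrl[i + 1] + ctrl[i + 2]
    d2_1 = ctrl[i + 1] - 2.0 * ctrl[i + 2] + ctrl[i + 3]
    # The acceleration of a cubic B-spline is linear in u on each segment.
    return ((1.0 - u) * d2_0 + u * d2_1) / dt**2


def estimate_scale(accel_visual, accel_metric):
    """Least-squares s minimizing ||s * accel_visual - accel_metric||^2.

    accel_visual: accelerations from an up-to-scale monocular trajectory.
    accel_metric: accelerations in metric units (assumed to come from a
                  physical model, such as a deformation-force model plus
                  gravity).
    """
    a = np.asarray(accel_visual, dtype=float).ravel()
    b = np.asarray(accel_metric, dtype=float).ravel()
    return float(a @ b / (a @ a))
```

For control points sampled from a constant-acceleration trajectory, `bspline_accel` returns that acceleration at any query time, and `estimate_scale` recovers the factor relating two acceleration sets that differ only by a scalar. In the full problem the scale, gravity direction, and spline control points would be optimized jointly rather than in closed form.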

Paper Structure

This paper contains 23 sections, 21 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Non-rigid monocular camera system demonstrating trajectory divergence. Unlike rigid systems, the deformable connection induces camera oscillations (green) distinct from the platform's motion (blue).
  • Figure 2: Block diagram of the pipeline, which consists of two parts. a) Modeling: We use a Deformation-force Network (DFN) to implicitly model the non-rigid connection, incorporating both linear and angular accelerations. Note that the modeling is performed offline. b) Solving: We first use odometry to recover the camera's trajectory, as shown in the green section of the figure. An optimization objective is then formulated based on Eq. \ref{opt-bspline} and minimized. Specifically, we optimize the B-spline control knots rather than the poses directly, as shown in the brown section of the figure.
  • Figure 3: Hardware setup. A typical non-rigid system consists of a base, a camera, and a non-rigid connection (here, a spring). The quantities to be estimated are the pose and scale of the base.
  • Figure 4: Optimization results for representative trajectories. For improved visualization and comparison, X-Z 2D projections are shown due to the subtle and less significant Y-axis (gravity direction) motion. Each subplot depicts the ground truth trajectory (solid line) and our optimized results (dashed line), with color intensity indicating the time scale.
  • Figure A1: Trajectories for different states of motion.
  • ...and 1 more figures