Learning to Predict 3D Rotational Dynamics from Images of a Rigid Body with Unknown Mass Distribution

Justice Mason, Christine Allen-Blanchette, Nicholas Zolman, Elizabeth Davison, Naomi Ehrich Leonard

TL;DR

This work tackles predicting 3D rotational dynamics of freely rotating rigid bodies from image sequences when internal mass distributions are unknown. It introduces a physics-informed neural network that maps images to a latent $SO(3)$ representation, computes angular velocities from latent pairs, and evolves the state using a generalized Hamiltonian (Lie–Poisson) framework, with a learned moment-of-inertia tensor. The model is trained with a combination of autoencoder and dynamics-based losses plus latent consistency terms, and evaluated on six synthetic datasets (including cubes, prisms, CALIPSO, and CloudSat) where it outperforms LSTM, Neural ODE, and HGN baselines, achieving long-horizon, accurate image predictions. The results highlight the value of enforcing $SO(3)$-structured latent spaces and Hamiltonian structure for interpretable, stable, and scalable dynamics learning from high-dimensional image data, with implications for space robotics and beyond.
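For orientation, the generalized Hamiltonian (Lie–Poisson) description of a freely rotating rigid body is usually written as below. This is the standard textbook form, not a transcription of the paper's equations. The symbols $\mathbf{\Pi}$ (body-frame angular momentum), $\boldsymbol{\omega}$ (body-frame angular velocity), $\mathbf{R} \in SO(3)$ (orientation), and the hat map $\widehat{\cdot}$ (vector to skew-symmetric matrix) are notation assumed here and may differ from the paper's.

$$
H(\mathbf{\Pi}) = \tfrac{1}{2}\, \mathbf{\Pi}^{\top} \mathbf{J}^{-1} \mathbf{\Pi},
\qquad
\dot{\mathbf{\Pi}} = \mathbf{\Pi} \times \frac{\partial H}{\partial \mathbf{\Pi}} = \mathbf{\Pi} \times \mathbf{J}^{-1} \mathbf{\Pi},
\qquad
\dot{\mathbf{R}} = \mathbf{R}\, \widehat{\boldsymbol{\omega}},
\quad
\boldsymbol{\omega} = \mathbf{J}^{-1} \mathbf{\Pi}.
$$

Under this form, the moment-of-inertia tensor $\mathbf{J}$ is the only physical parameter in the dynamics, which is why learning it is enough to capture an unknown mass distribution.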

Abstract

In many real-world settings, image observations of freely rotating 3D rigid bodies may be available when low-dimensional measurements are not. However, the high dimensionality of image data precludes the use of classical estimation techniques to learn the dynamics. The usefulness of standard deep learning methods is also limited, because an image of a rigid body reveals nothing about the distribution of mass inside the body, which, together with initial angular velocity, is what determines how the body will rotate. We present a physics-based neural network model to estimate and predict 3D rotational dynamics from image sequences. We achieve this using a multi-stage prediction pipeline that maps individual images to a latent representation homeomorphic to $\mathbf{SO}(3)$, computes angular velocities from latent pairs, and predicts future latent states using the Hamiltonian equations of motion. We demonstrate the efficacy of our approach on new rotating rigid-body datasets of sequences of synthetic images of rotating objects, including cubes, prisms, and satellites, with unknown uniform and non-uniform mass distributions. Our model outperforms competing baselines on our datasets, producing better qualitative predictions and reducing the error observed for the state-of-the-art Hamiltonian Generative Network by a factor of 2.
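The step "computes angular velocities from latent pairs" can be illustrated with a minimal sketch. It assumes the latent states are rotation matrices sampled a fixed time `dt` apart and that the angular velocity is roughly constant over that interval. The abstract does not spell out the estimator the paper uses, so the log-map approach and all function names below are illustrative assumptions.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to its skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rodrigues' formula: rotation matrix for the rotation vector w."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:
        # Second-order Taylor expansion avoids dividing by a tiny angle.
        return np.eye(3) + K + 0.5 * K @ K
    return (np.eye(3) + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * K @ K)

def log_so3(R):
    """Inverse of exp_so3, valid for rotation angles below pi."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    # The antisymmetric part of R equals 2 * sin(theta) * (unit axis).
    v = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < 1e-8:
        return 0.5 * v
    return theta / (2.0 * np.sin(theta)) * v

def body_angular_velocity(R0, R1, dt):
    """Constant body-frame angular velocity that carries R0 to R1 in time dt."""
    return log_so3(R0.T @ R1) / dt

# Usage: build a consecutive pair from a known angular velocity, then recover it.
omega_true = np.array([0.3, -0.2, 0.5])
dt = 0.1
R0 = exp_so3(np.array([0.4, 0.1, -0.7]))
R1 = R0 @ exp_so3(omega_true * dt)
omega_est = body_angular_velocity(R0, R1, dt)
```

The relative rotation `R0.T @ R1` is used because, with the body-frame convention $\dot{\mathbf{R}} = \mathbf{R}\,\widehat{\boldsymbol{\omega}}$, a constant angular velocity gives $\mathbf{R}_1 = \mathbf{R}_0 \exp(\widehat{\boldsymbol{\omega}}\,\Delta t)$.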

Paper Structure

This paper contains 34 sections, 11 equations, 10 figures, and 4 tables.

Figures (10)

  • Figure S1: Simulations illustrating how mass distribution and initial angular velocity determine behavior. (a) Tumbling prism: uniform mass distribution ($\mathbf{J} = \mathbf{J}_1$) and initial angular velocity near an unstable solution. (b) Spinning prism: $\mathbf{J} = \mathbf{J}_1$ and initial angular velocity near a stable solution. (c) Spinning CALIPSO satellite: $\mathbf{J} = \mathbf{J}_1$ and same initial angular velocity as (b). (d) Wobbling prism: non-uniform mass distribution ($\mathbf{J} = \mathbf{J}_3$) and same initial angular velocity as (b).
  • Figure S2: A schematic of the model's forward pass at training time and inference. (a) Encoding pipeline at training; (b) encoding pipeline at inference; (c) decoding for auto-encoding reconstruction; and (d) dynamics prediction and decoding for dynamics-based reconstruction.
  • Figure S3: Predicted sequences for uniform and non-uniform mass density cube and prism datasets given by our model. The figure shows predicted images at time steps $\tau = 0$ to $5$ and $\tau = 45$ to $50$.
  • Figure S4: Predicted sequences for the CALIPSO satellite (top) and CloudSat satellite (bottom) with uniform mass densities given by our model. The figure shows predicted images at every 10th time step from $\tau = 0$ to $90$.
  • Figure A1: Predicted sequences for uniform/non-uniform prism and cube datasets given by the LSTM-baseline. The figure shows time steps $\tau = 10$ through $\tau = 20$. These are the first 11 predictions of the model.
  • ...and 5 more figures
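
The point of Figure S1, that the inertia tensor and the initial angular velocity together decide whether a body tumbles or spins steadily, can be reproduced with a short simulation. The sketch below integrates Euler's torque-free rigid-body equations with a fixed-step RK4 scheme. The diagonal inertia values, initial conditions, step size, and function names are illustrative assumptions. They are not the paper's $\mathbf{J}_1$ or $\mathbf{J}_3$, nor its simulation setup.

```python
import numpy as np

def momentum_rate(Pi, J_inv):
    """Euler's equations for a torque-free body: dPi/dt = Pi x (J^-1 Pi)."""
    return np.cross(Pi, J_inv @ Pi)

def simulate(J, omega0, dt=0.01, steps=6000):
    """Integrate with classical RK4; returns body angular velocities, shape (steps + 1, 3)."""
    J_inv = np.linalg.inv(J)
    Pi = J @ np.asarray(omega0, dtype=float)  # body-frame angular momentum
    omegas = [J_inv @ Pi]
    for _ in range(steps):
        k1 = momentum_rate(Pi, J_inv)
        k2 = momentum_rate(Pi + 0.5 * dt * k1, J_inv)
        k3 = momentum_rate(Pi + 0.5 * dt * k2, J_inv)
        k4 = momentum_rate(Pi + dt * k3, J_inv)
        Pi = Pi + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        omegas.append(J_inv @ Pi)
    return np.array(omegas)

# Hypothetical inertia tensor with three distinct principal moments.
J = np.diag([1.0, 2.0, 3.0])

# Start near the intermediate axis (unstable): the body tumbles.
tumbling = simulate(J, [0.01, 1.0, 0.01])

# Start near the largest-inertia axis (stable): the body keeps spinning.
spinning = simulate(J, [0.01, 0.01, 1.0])
```

In the first run the angular velocity component along the intermediate axis repeatedly reverses sign, while in the second run the component along the largest-inertia axis stays close to its initial value. Changing `J` while keeping the same initial angular velocity changes the motion as well, which is the effect panel (d) describes.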