Learning to Predict 3D Rotational Dynamics from Images of a Rigid Body with Unknown Mass Distribution

Justice Mason; Christine Allen-Blanchette; Nicholas Zolman; Elizabeth Davison; Naomi Ehrich Leonard

TL;DR

This work addresses the prediction of 3D rotational dynamics of freely rotating rigid bodies from image sequences when the internal mass distribution is unknown. It introduces a physics-informed neural network that maps images to a latent $\mathbf{SO}(3)$ representation, computes angular velocities from pairs of latent states, and evolves the state using a generalized Hamiltonian (Lie–Poisson) formulation with a learned moment-of-inertia tensor. The model is trained with a combination of autoencoder and dynamics-based losses plus latent consistency terms. It is evaluated on six synthetic datasets (including cubes, prisms, and the CALIPSO and CloudSat satellites), where it outperforms LSTM, Neural ODE, and Hamiltonian Generative Network (HGN) baselines and produces accurate long-horizon image predictions. The results highlight the value of enforcing $\mathbf{SO}(3)$-structured latent spaces and Hamiltonian structure for interpretable, stable, and scalable dynamics learning from high-dimensional image data, with implications for space robotics.
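To make the "angular velocities from latent pairs" step concrete, here is a minimal sketch of one standard way to recover a body-frame angular velocity from two consecutive rotation matrices: take the matrix logarithm of the relative rotation and divide by the time step. This is an illustrative assumption, not necessarily the estimator used in the paper; the function names (`log_so3`, `body_angular_velocity`, `rot_z`) and the sampling interval `dt` are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rot_z(theta):
    """Rotation by theta radians about the z axis (helper for the example)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def log_so3(R):
    """Rotation vector (axis * angle) of R, for rotation angles below pi."""
    cos_angle = max(-1.0, min(1.0, (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0))
    angle = math.acos(cos_angle)
    # The skew-symmetric part (R - R^T) / 2 encodes sin(angle) * axis.
    v = [(R[2][1] - R[1][2]) / 2.0,
         (R[0][2] - R[2][0]) / 2.0,
         (R[1][0] - R[0][1]) / 2.0]
    if angle < 1e-8:
        return v  # small-angle limit, where sin(angle) is close to angle
    scale = angle / math.sin(angle)
    return [scale * x for x in v]

def body_angular_velocity(R_now, R_next, dt):
    """Assumes R_next = R_now * exp(hat(omega) * dt) with constant omega."""
    relative = matmul(transpose(R_now), R_next)
    return [w / dt for w in log_so3(relative)]

# Two orientations 0.2 rad apart about z, sampled 0.1 s apart.
omega = body_angular_velocity(rot_z(0.3), rot_z(0.5), dt=0.1)
# omega is approximately [0.0, 0.0, 2.0] rad/s
```

The estimate is only exact when the angular velocity is constant over the interval; for a tumbling body it is a first-order approximation whose accuracy depends on the frame rate.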

Abstract

In many real-world settings, image observations of freely rotating 3D rigid bodies may be available when low-dimensional measurements are not. However, the high dimensionality of image data precludes the use of classical estimation techniques to learn the dynamics. The usefulness of standard deep learning methods is also limited, because an image of a rigid body reveals nothing about the distribution of mass inside the body, which, together with the initial angular velocity, determines how the body will rotate. We present a physics-based neural network model to estimate and predict 3D rotational dynamics from image sequences. We achieve this using a multi-stage prediction pipeline that maps individual images to a latent representation homeomorphic to $\mathbf{SO}(3)$, computes angular velocities from latent pairs, and predicts future latent states using the Hamiltonian equations of motion. We demonstrate the efficacy of our approach on new rotating rigid-body datasets of sequences of synthetic images of rotating objects, including cubes, prisms and satellites, with unknown uniform and non-uniform mass distributions. Our model outperforms competing baselines on our datasets, producing better qualitative predictions and reducing the error observed for the state-of-the-art Hamiltonian Generative Network by a factor of 2.
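The claim that mass distribution and initial angular velocity together determine the rotation can be illustrated with the free rigid-body equations in Lie–Poisson form, $\dot{\boldsymbol{\Pi}} = \boldsymbol{\Pi} \times \mathbf{J}^{-1}\boldsymbol{\Pi}$, where $\boldsymbol{\Pi} = \mathbf{J}\boldsymbol{\omega}$ is the body angular momentum. The sketch below is a rough illustration under stated assumptions, not the paper's implementation: it assumes a known diagonal inertia tensor (`inertia_diag`, principal axes) rather than a learned one, and it uses a classical fourth-order Runge-Kutta integrator chosen here for simplicity. All function names are hypothetical.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def lie_poisson_rhs(momentum, inertia_diag):
    """Time derivative of body angular momentum: Pi x (J^-1 Pi)."""
    omega = [momentum[i] / inertia_diag[i] for i in range(3)]
    return cross(momentum, omega)

def rk4_step(momentum, inertia_diag, dt):
    """One classical Runge-Kutta step (an assumed integrator choice)."""
    def shifted(k, scale):
        return [momentum[i] + scale * k[i] for i in range(3)]
    k1 = lie_poisson_rhs(momentum, inertia_diag)
    k2 = lie_poisson_rhs(shifted(k1, dt / 2.0), inertia_diag)
    k3 = lie_poisson_rhs(shifted(k2, dt / 2.0), inertia_diag)
    k4 = lie_poisson_rhs(shifted(k3, dt), inertia_diag)
    return [momentum[i] + dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
            for i in range(3)]

def kinetic_energy(momentum, inertia_diag):
    """Hamiltonian of the free rigid body: 0.5 * Pi^T J^-1 Pi."""
    return 0.5 * sum(momentum[i] ** 2 / inertia_diag[i] for i in range(3))

def rollout(momentum, inertia_diag, dt, steps):
    """Integrate forward and return the list of momentum states."""
    trajectory = [list(momentum)]
    for _ in range(steps):
        momentum = rk4_step(momentum, inertia_diag, dt)
        trajectory.append(momentum)
    return trajectory

# Illustrative values: three distinct principal moments and an initial
# momentum close to the intermediate axis, which gives tumbling motion.
J = [1.0, 2.0, 3.0]
trajectory = rollout([0.1, 2.0, 0.1], J, dt=0.01, steps=1000)
energy_drift = abs(kinetic_energy(trajectory[-1], J) - kinetic_energy(trajectory[0], J))
```

Changing `J` while keeping the same initial momentum produces a different trajectory, which is why the inertia tensor cannot be ignored even though it is invisible in a single image. The energy drift gives a quick sanity check on the integration, since the exact flow conserves both the kinetic energy and the momentum norm.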

Paper Structure (34 sections, 11 equations, 10 figures, 4 tables)

Introduction
Related Work
Background
The $\mathbb{S}^2 \times \mathbb{S}^2$ Parameterization of the 3D Rotation Group $\mathbf{SO}(3)$
3D Rotating Rigid-Body Kinematics
3D Rigid-Body Dynamics in Hamiltonian Form
Materials and Methods
Notation
Embedding Images to an SO(3) Latent Space
Computing Dynamics in the Latent Space
Decoding SO(3) Latent States to Images
Training Methodology
Reconstruction Losses
Latent Losses
3D Rotating Rigid-Body Datasets
...and 19 more sections

Figures (10)

Figure S1: Simulations illustrating how mass distribution and initial angular velocity determine behavior. (a) Tumbling prism: uniform mass distribution ($\mathbf{J} = \mathbf{J}_1$) and initial angular velocity near an unstable solution. (b) Spinning prism: $\mathbf{J} = \mathbf{J}_1$ and initial angular velocity near a stable solution. (c) Spinning CALIPSO satellite: $\mathbf{J} = \mathbf{J}_1$ and same initial angular velocity as (b). (d) Wobbling prism: non-uniform mass distribution ($\mathbf{J} = \mathbf{J}_3$) and same initial angular velocity as (b).
Figure S2: A schematic of the model's forward pass at training time and inference. (a) Encoding pipeline at training; (b) encoding pipeline at inference; (c) decoding for auto-encoding reconstruction; and (d) dynamics prediction and decoding for dynamics-based reconstruction.
Figure S3: Predicted sequences for uniform and non-uniform mass density cube and prism datasets given by our model. The figure shows predicted images at time steps $\tau =$ 0 to 5 and $\tau =$ 45 to 50.
Figure S4: Predicted sequences for the CALIPSO satellite (top) and CloudSat satellite (bottom) with uniform mass densities given by our model. The figure shows predicted images at every 10th time step from $\tau =$ 0 to 90.
Figure A1: Predicted sequences for uniform/non-uniform prism and cube datasets given by the LSTM-baseline. The figure shows time steps $\tau = 10$ through $\tau = 20$. These are the first 11 predictions of the model.
...and 5 more figures
