Learning to Predict 3D Rotational Dynamics from Images of a Rigid Body with Unknown Mass Distribution
Justice Mason, Christine Allen-Blanchette, Nicholas Zolman, Elizabeth Davison, Naomi Ehrich Leonard
TL;DR
This work tackles the prediction of 3D rotational dynamics of freely rotating rigid bodies from image sequences when the internal mass distribution is unknown. It introduces a physics-informed neural network that maps each image to a latent $\mathbf{SO}(3)$ representation, computes angular velocities from pairs of latent states, and evolves the state with a generalized Hamiltonian (Lie–Poisson) formulation that uses a learned moment-of-inertia tensor. The model is trained with a combination of autoencoder and dynamics-based losses plus latent consistency terms. It is evaluated on six synthetic datasets (including cubes, prisms, CALIPSO, and CloudSat), where it outperforms LSTM, Neural ODE, and Hamiltonian Generative Network (HGN) baselines and produces accurate image predictions over long horizons. The results highlight the value of an $\mathbf{SO}(3)$-structured latent space and Hamiltonian structure for interpretable, stable, and scalable dynamics learning from high-dimensional image data, with implications for space robotics and beyond.
Abstract
In many real-world settings, image observations of freely rotating 3D rigid bodies may be available when low-dimensional measurements are not. However, the high dimensionality of image data precludes the use of classical estimation techniques to learn the dynamics. The usefulness of standard deep learning methods is also limited, because an image of a rigid body reveals nothing about the distribution of mass inside the body, which, together with the initial angular velocity, determines how the body will rotate. We present a physics-based neural network model to estimate and predict 3D rotational dynamics from image sequences. We achieve this using a multi-stage prediction pipeline that maps individual images to a latent representation homeomorphic to $\mathbf{SO}(3)$, computes angular velocities from latent pairs, and predicts future latent states using the Hamiltonian equations of motion. We demonstrate the efficacy of our approach on new rotating rigid-body datasets consisting of sequences of synthetic images of rotating objects, including cubes, prisms, and satellites, with unknown uniform and non-uniform mass distributions. Our model outperforms competing baselines on our datasets, producing better qualitative predictions and reducing the error observed for the state-of-the-art Hamiltonian Generative Network by a factor of 2.
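The latent-space part of the pipeline described in the abstract (orientations on $\mathbf{SO}(3)$, angular velocity from a pair of orientations, Hamiltonian evolution) can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes the orientations are already available as rotation matrices (no image encoder or decoder), uses a fixed inertia matrix `J` where the paper learns one, and integrates the body-frame Lie–Poisson (Euler) equations $\dot{\Pi} = \Pi \times J^{-1}\Pi$ with a classical RK4 step chosen here for simplicity. All function names are hypothetical.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    x, y, z = w
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def vee(W):
    """Inverse of hat: extract the 3-vector from a skew-symmetric matrix."""
    return np.array([W[2, 1], W[0, 2], W[1, 0]])

def exp_so3(w):
    """Rodrigues' formula: rotation matrix for the rotation vector w."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-12:
        return np.eye(3) + K  # first-order approximation near the identity
    return (np.eye(3) + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))

def log_so3(R):
    """Rotation vector of R (valid for rotation angles below pi)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-12:
        return vee(R - R.T) / 2.0
    return theta / (2.0 * np.sin(theta)) * vee(R - R.T)

def body_angular_velocity(R0, R1, dt):
    """Estimate body-frame angular velocity from two orientations,
    assuming R1 = R0 @ exp(hat(omega) * dt)."""
    return log_so3(R0.T @ R1) / dt

def lie_poisson_rhs(Pi, J_inv):
    """Euler's equations in momentum form: dPi/dt = Pi x (J^{-1} Pi)."""
    return np.cross(Pi, J_inv @ Pi)

def rk4_step(Pi, J_inv, dt):
    """One classical Runge-Kutta step for the body angular momentum Pi."""
    k1 = lie_poisson_rhs(Pi, J_inv)
    k2 = lie_poisson_rhs(Pi + 0.5 * dt * k1, J_inv)
    k3 = lie_poisson_rhs(Pi + 0.5 * dt * k2, J_inv)
    k4 = lie_poisson_rhs(Pi + dt * k3, J_inv)
    return Pi + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def rollout(R0, omega0, J, dt, steps):
    """Predict future orientations from an initial orientation and
    body angular velocity, given an (assumed known) inertia matrix J."""
    J_inv = np.linalg.inv(J)
    Pi = J @ omega0            # body angular momentum
    R = R0.copy()
    Rs = [R]
    for _ in range(steps):
        omega = J_inv @ Pi
        R = R @ exp_so3(omega * dt)   # kinematics: dR/dt = R hat(omega)
        Pi = rk4_step(Pi, J_inv, dt)  # dynamics on so(3)*
        Rs.append(R)
    return Rs, Pi
```

In this sketch, `body_angular_velocity` plays the role of the "angular velocities from latent pairs" step, and `rollout` plays the role of the Hamiltonian prediction step. Updating `R` through the exponential map keeps every predicted state on $\mathbf{SO}(3)$ up to floating-point error, which is one reason to prefer a structured latent space over an unconstrained vector.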
