Vision-based Discovery of Nonlinear Dynamics for 3D Moving Target
Zitong Zhang, Yang Liu, Hao Sun
TL;DR
This work addresses learning nonlinear 3D dynamics directly from video by fusing multi-view target tracking, Rodrigues' rotation-based coordinate transformation, and a spline-enhanced library-based sparse regressor. It reconstructs 3D trajectories from a three-camera setup with calibration of only one camera and uses cubic B-splines to model the trajectory while enforcing physics constraints through a collocation-based sparse regression framework. The approach yields compact governing equations that closely match ground-truth dynamics across multiple synthetic chaotic systems and outperforms PySINDy in 3D equation discovery, even under noise and data gaps. The results demonstrate a practical pathway for vision-based discovery of dynamics with potential applications in robotics, surveillance, and scientific sensing, and point to future work on real-world videos and multi-target dynamics.
Abstract
Data-driven discovery of governing equations has kindled significant interests in many science and engineering areas. Existing studies primarily focus on uncovering equations that govern nonlinear dynamics based on direct measurement of the system states (e.g., trajectories). Limited efforts have been placed on distilling governing laws of dynamics directly from videos for moving targets in a 3D space. To this end, we propose a vision-based approach to automatically uncover governing equations of nonlinear dynamics for 3D moving targets via raw videos recorded by a set of cameras. The approach is composed of three key blocks: (1) a target tracking module that extracts plane pixel motions of the moving target in each video, (2) a Rodrigues' rotation formula-based coordinate transformation learning module that reconstructs the 3D coordinates with respect to a predefined reference point, and (3) a spline-enhanced library-based sparse regressor that uncovers the underlying governing law of dynamics. This framework is capable of effectively handling the challenges associated with measurement data, e.g., noise in the video, imprecise tracking of the target that causes data missing, etc. The efficacy of our method has been demonstrated through multiple sets of synthetic videos considering different nonlinear dynamics.
