Unified Control Framework for Real-Time Interception and Obstacle Avoidance of Fast-Moving Objects with Diffusion Variational Autoencoder
Apan Dastider, Hao Fang, Mingjie Lin
TL;DR
This work tackles real-time interception of fast-moving targets by a 7-DoF robotic arm in cluttered, dynamic environments. It introduces a unified control framework with three components: a Diffusion Variational Autoencoder (D-VAE) that maps high-dimensional state data onto a 2D latent manifold $M_Z$, an offline densely connected graph in that latent space for fast shortest-path planning, and an extended Kalman filter (EKF) that tracks the target in real time to drive control. The framework is validated both in simulation and on a real 7-DoF Panda arm, demonstrating reliable obstacle avoidance and high interception accuracy. The approach reduces planning latency and improves robustness in dynamic environments, with potential for generalization to more complex manipulators and perception pipelines.
Abstract
Real-time interception of fast-moving objects by robotic arms in dynamic environments poses a formidable challenge because the arm must react rapidly, often within milliseconds, while moving obstacles are present. This paper introduces a unified control framework that addresses this challenge by simultaneously intercepting dynamic objects and avoiding moving obstacles. Central to our approach is the use of a diffusion-based variational autoencoder for motion planning, which performs both object interception and obstacle avoidance. We begin by encoding the high-dimensional temporal information from streaming events into a two-dimensional latent manifold, which enables discrimination between safe and colliding trajectories and culminates in the construction of an offline densely connected trajectory graph. Subsequently, we employ an extended Kalman filter to achieve precise real-time tracking of the moving object. Leveraging a graph-traversing strategy on the established offline dense graph, we generate encoded robotic motor control commands. Finally, we decode these commands to enable real-time motion of robotic motors, ensuring effective obstacle avoidance and high interception accuracy of fast-moving objects. Experimental validation in both computer simulations and on autonomous 7-DoF robotic arms demonstrates the efficacy of our proposed framework. Results indicate that the robotic manipulator can navigate around multiple obstacles of varying sizes and shapes while successfully intercepting fast-moving objects thrown by hand from different angles. Complete video demonstrations of our experiments can be found at https://sites.google.com/view/multirobotskill/home.
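To make the graph-traversal step more concrete, the following is a minimal sketch of shortest-path routing over a set of 2D latent points, with colliding points excluded from the graph. It is not the authors' implementation. The regular grid of latent points, the `is_safe` predicate, the connection `radius`, and the use of Dijkstra's algorithm are all assumptions made for illustration; in the framework described above, the latent points would come from the trained encoder and the resulting path would be passed to the decoder to obtain motor commands.

```python
import heapq
import math


def build_graph(nodes, radius, is_safe):
    """Connect every pair of safe 2D latent points no farther apart than `radius`.

    `is_safe` is a hypothetical predicate standing in for the
    safe-versus-colliding discrimination in latent space.
    """
    safe = [i for i, z in enumerate(nodes) if is_safe(z)]
    graph = {i: [] for i in safe}
    for pos, i in enumerate(safe):
        for j in safe[pos + 1:]:
            d = math.dist(nodes[i], nodes[j])
            if d <= radius:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns a list of node indices, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            # Walk the predecessor links back to the start.
            path = [u]
            while u in prev:
                u = prev[u]
                path.append(u)
            return path[::-1]
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None


# Illustrative latent points: a 5x5 grid, where node index = 5 * x + y.
nodes = [(float(x), float(y)) for x in range(5) for y in range(5)]


def is_safe(z):
    # Hypothetical obstacle: a wall at x == 2 that leaves a gap only at y == 4.
    return not (z[0] == 2.0 and z[1] <= 3.0)


graph = build_graph(nodes, radius=1.0, is_safe=is_safe)  # built once, offline
path = shortest_path(graph, start=0, goal=20)            # queried online
print([nodes[i] for i in path])
```

Building the graph once offline and answering only shortest-path queries at run time is what keeps the online cost low in this sketch: each query is a search over a small 2D graph rather than a plan in the arm's full joint space.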
