End-to-End Imitation Learning for Optimal Asteroid Proximity Operations
Patrick Quinn, George Nehma, Madhur Tiwari
TL;DR
This work tackles autonomous proximity operations near asteroids under long communication delays and limited onboard compute. It introduces two controllers: an end-to-end imitation-learning controller that maps raw lidar data to near-optimal thrust commands, and a hybrid MPC-guided imitation-learning controller that serves as runtime assurance. The authors build a large dataset by simulating ~400 MPC-driven transits around the asteroid Kleopatra, adding disturbances so that the data include recovery trajectories, and augmenting with synthetic lidar data. They then train a CNN-LSTM-MLP network to reproduce the MPC actions using an RMSE loss. Forward testing reveals that pure imitation learning can fail and collide, whereas the hybrid approach achieves near-MPC trajectories for about 70% of steps and reduces inference time and energy consumption (inference time from ~0.473 s to ~0.053 s; CPU energy ~29.2% and GPU energy ~21.6%). The work demonstrates the practicality of end-to-end learning with runtime assurance for space GNC and highlights the importance of diverse training data in mitigating covariate shift.
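To make the training objective and the reported timing figures concrete, the following minimal sketch computes an RMSE between network outputs and MPC thrust commands, and the speedup implied by the two quoted inference times. The function names, the per-step `[ux, uy, uz]` thrust layout, and the sample action values are assumptions made for illustration; they are not taken from the paper.

```python
import math


def rmse(predicted, expert):
    """Root-mean-square error over all components of all thrust vectors.

    Both arguments are equal-length lists of thrust vectors
    (hypothetical layout: one [ux, uy, uz] list per time step).
    """
    if len(predicted) != len(expert) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    squared_error = 0.0
    count = 0
    for p_vec, e_vec in zip(predicted, expert):
        if len(p_vec) != len(e_vec):
            raise ValueError("thrust vectors must have matching dimensions")
        for p, e in zip(p_vec, e_vec):
            squared_error += (p - e) ** 2
            count += 1
    return math.sqrt(squared_error / count)


def speedup(baseline_s, new_s):
    """Ratio of baseline inference time to new inference time."""
    return baseline_s / new_s


# Illustrative actions only, not data from the paper.
mpc_actions = [[0.0, 0.1, -0.2], [0.05, 0.0, 0.0]]
net_actions = [[0.0, 0.1, -0.2], [0.05, 0.0, 0.3]]
loss = rmse(net_actions, mpc_actions)

# Inference times quoted in the TL;DR (seconds).
ratio = speedup(0.473, 0.053)
fraction_saved = 1.0 - 0.053 / 0.473
```

With the quoted times, `ratio` is roughly 8.9 and `fraction_saved` is roughly 0.89, i.e. the hybrid controller's inference takes about one ninth of the MPC time per step.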
Abstract
Controlling spacecraft near asteroids in deep space poses many challenges. The communication delays involved necessitate heavy use of limited onboard computation resources, while fuel efficiency remains a priority to support the long loiter times needed for gathering data. Additionally, the lack of traditional reference systems makes state determination difficult, so the guidance, navigation, and control (GNC) pipeline should ideally be both computationally efficient and fuel-efficient, and should incorporate a robust state determination system. In this paper, we propose an end-to-end algorithm that uses neural networks to generate near-optimal control commands from raw sensor data, as well as a hybrid model predictive control (MPC) guided imitation learning controller that improves computational efficiency over a traditional MPC controller.
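The hybrid controller can be pictured as a generic runtime-assurance switch: a cheap learned policy proposes the thrust command, and the more expensive MPC takes over when a monitor flags the situation as unsafe. The sketch below illustrates that general pattern only. The switching rule (minimum lidar range against a threshold), the names `hybrid_step`, `fake_policy`, and `fake_mpc`, and all numeric values are assumptions, not the switching logic used by the authors.

```python
from typing import Callable, Sequence, Tuple

Action = Tuple[float, float, float]
Controller = Callable[[Sequence[float]], Action]


def hybrid_step(
    lidar_ranges_m: Sequence[float],
    learned_policy: Controller,
    mpc_controller: Controller,
    min_safe_range_m: float,
) -> Tuple[Action, bool]:
    """Pick a thrust command and report whether the learned policy produced it.

    Assumed monitor: trust the learned policy only while every lidar
    return is at least `min_safe_range_m` away; otherwise fall back to MPC.
    """
    if not lidar_ranges_m:
        raise ValueError("at least one lidar range is required")
    if min(lidar_ranges_m) >= min_safe_range_m:
        return learned_policy(lidar_ranges_m), True
    return mpc_controller(lidar_ranges_m), False


# Stand-in controllers with fixed outputs (hypothetical, for illustration).
def fake_policy(obs: Sequence[float]) -> Action:
    return (0.0, 0.0, 0.1)


def fake_mpc(obs: Sequence[float]) -> Action:
    return (0.0, 0.0, -0.5)


# Far from the surface: the learned policy is used.
far_action, far_used_learned = hybrid_step(
    [120.0, 95.0, 130.0], fake_policy, fake_mpc, 50.0
)
# One return is below the threshold: MPC takes over.
near_action, near_used_learned = hybrid_step(
    [120.0, 30.0, 130.0], fake_policy, fake_mpc, 50.0
)
```

In a scheme like this, the computational saving depends on how often the monitor lets the learned policy act, since MPC is only solved on the remaining steps.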
