Deep reinforcement learning-based spacecraft attitude control with pointing keep-out constraint
Juntang Yang, Mohamed Khalil Ben-Larbi
TL;DR
Deep reinforcement learning is applied to spacecraft attitude reorientation under a single pointing keep-out constraint using the Soft Actor-Critic (SAC) algorithm, with a constraint-aware state representation and a tailored reward that penalizes entering the keep-out zone. A two-phase curriculum training strategy improves learning efficiency and constraint handling. Monte Carlo tests (10,000 scenarios) show ~97% success in reaching the target attitude while respecting the keep-out zone. The remaining failures indicate that reward shaping alone cannot guarantee safety and that safe RL approaches are needed. The work demonstrates the potential for real-time, onboard attitude control under pointing constraints and provides a foundation for extending to multiple constraints and shielded RL strategies.
Abstract
This paper implements deep reinforcement learning (DRL) for spacecraft reorientation control with a single pointing keep-out zone. The Soft Actor-Critic (SAC) algorithm is adopted to handle continuous state and action spaces. A new state representation is designed that explicitly includes a compact representation of the attitude constraint zone. The reward function is formulated to achieve the control objective while enforcing the attitude constraint. A curriculum learning approach is used to train the agent. Simulation results demonstrate the effectiveness of the proposed DRL-based method for spacecraft pointing-constrained attitude control.
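To make the keep-out constraint and the reward idea concrete, the following is a minimal sketch and not the paper's actual formulation. It assumes a scalar-first unit quaternion that rotates body-frame vectors into the inertial frame, a body-fixed boresight vector, an inertial direction to the object to avoid, and a cone half-angle. The function names, the reward shape, and the `violation_penalty` value are all hypothetical choices made for illustration.

```python
import math

def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q = (w, x, y, z) (assumed convention)."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * (q_vec x v)
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    # v' = v + w * t + q_vec x t
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )

def angle_between(u, v):
    """Angle in radians between two non-zero 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_angle)

def keep_out_margin(q, boresight_body, forbidden_inertial, half_angle):
    """Positive when the boresight is outside the keep-out cone, negative inside."""
    pointing = quat_rotate(q, boresight_body)
    return angle_between(pointing, forbidden_inertial) - half_angle

def shaped_reward(q, q_target, boresight_body, forbidden_inertial,
                  half_angle, violation_penalty=10.0):
    """Hypothetical reward: negative attitude error, minus a penalty on violation."""
    # Rotation angle between current and target attitude (sign-invariant).
    dot = abs(sum(a * b for a, b in zip(q, q_target)))
    error_angle = 2.0 * math.acos(min(1.0, dot))
    reward = -error_angle
    if keep_out_margin(q, boresight_body, forbidden_inertial, half_angle) < 0.0:
        reward -= violation_penalty  # assumed fixed penalty, not from the paper
    return reward
```

A penalty term of this kind only discourages violations during training and does not rule them out, so a policy trained against such a reward can still enter the keep-out zone in some scenarios.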
