Deep reinforcement learning-based spacecraft attitude control with pointing keep-out constraint

Juntang Yang, Mohamed Khalil Ben-Larbi

TL;DR

Deep reinforcement learning is applied to spacecraft attitude reorientation under a single pointing keep-out constraint. The Soft Actor-Critic (SAC) agent uses a constraint-aware state representation and a tailored reward to enforce the keep-out zone. A two-phase curriculum training strategy improves learning efficiency and constraint handling. In Monte Carlo tests (10,000 scenarios), the agent reaches the target attitude while respecting the keep-out zone in ~97% of cases. The remaining failures indicate that reward shaping alone cannot guarantee safety and that safe RL approaches are needed. The work demonstrates the potential for real-time, onboard attitude control under pointing constraints and provides a foundation for extending to multiple constraints and shielded RL strategies.
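To put the ~97% figure in context, the sketch below estimates the statistical uncertainty of a success rate measured over 10,000 independent trials, using a Wilson score interval. The exact success count is not given in this summary, so the value 9,700 is an assumption chosen to match "~97%", and the function name `wilson_interval` is illustrative only.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2.0 * trials)) / denom
    half_width = (
        z
        * math.sqrt(p_hat * (1.0 - p_hat) / trials + z * z / (4.0 * trials * trials))
        / denom
    )
    return center - half_width, center + half_width


# ASSUMPTION: 9,700 successes stands in for the reported "~97%" of 10,000 runs.
low, high = wilson_interval(successes=9700, trials=10000)
print(f"95% interval for the success rate: [{low:.4f}, {high:.4f}]")
```

Under that assumed count, the interval spans roughly 96.6% to 97.3%. With this many trials the failure rate is statistically well separated from zero, which is consistent with the conclusion that the failures are not sampling noise.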

Abstract

This paper implements deep reinforcement learning (DRL) for spacecraft reorientation control with a single pointing keep-out zone. The Soft Actor-Critic (SAC) algorithm is adopted to handle continuous state and action spaces. A new state representation is designed to explicitly include a compact representation of the attitude constraint zone. The reward function is formulated to achieve the control objective while enforcing the attitude constraint. A curriculum learning approach is used to train the agent. Simulation results demonstrate the effectiveness of the proposed DRL-based method for spacecraft pointing-constrained attitude control.
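A pointing keep-out zone is commonly modeled as a cone around a forbidden direction, and the constraint is checked through the angle between that direction and the instrument boresight. The following is a minimal sketch of such a check, not the paper's formulation: the scalar-first quaternion convention, the body-to-inertial rotation direction, the function names, and the definition of the margin as separation angle minus cone half-angle are all assumptions made for illustration.

```python
import math


def rotate_by_quaternion(q, v):
    """Rotate 3-vector v by unit quaternion q = (w, x, y, z).

    ASSUMPTION: scalar-first convention; q maps body-frame vectors to the
    inertial frame.
    """
    w, x, y, z = q
    # t = 2 * (q_vec x v)
    tx = 2.0 * (y * v[2] - z * v[1])
    ty = 2.0 * (z * v[0] - x * v[2])
    tz = 2.0 * (x * v[1] - y * v[0])
    # v' = v + w * t + q_vec x t
    return (
        v[0] + w * tx + (y * tz - z * ty),
        v[1] + w * ty + (z * tx - x * tz),
        v[2] + w * tz + (x * ty - y * tx),
    )


def keep_out_margin(q, boresight_body, zone_axis_inertial, half_angle_rad):
    """Angular margin (rad) between the boresight and the keep-out cone.

    Positive: boresight is outside the cone. Negative: constraint violated.
    Both direction vectors are expected to be unit length.
    """
    b = rotate_by_quaternion(q, boresight_body)
    cos_sep = sum(bi * ai for bi, ai in zip(b, zone_axis_inertial))
    cos_sep = max(-1.0, min(1.0, cos_sep))  # guard acos against round-off
    return math.acos(cos_sep) - half_angle_rad


# Hypothetical example: boresight along body x, forbidden direction along
# inertial y, 30 degree cone half-angle, identity attitude.
margin = keep_out_margin(
    (1.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), math.radians(30.0)
)
print(f"margin = {math.degrees(margin):.1f} deg")
```

A signed margin of this kind is one compact quantity that could be fed into a state vector or a reward penalty. How the paper actually encodes the zone and defines $\theta_\text{margin}$ is not stated in this summary.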

Paper Structure

This paper contains 9 sections, 11 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Spacecraft sketch with keep-out zone for telescope.
  • Figure 2: Trace of boresight vector on unit-sphere with agent trained in Phase 1 (without F-zone).
  • Figure 3: Time history of relative attitude, angular velocity, control torque, and $\theta_\text{margin}$ under agent trained in Phase 1 (without F-zone).
  • Figure 4: Trace of boresight vector on unit-sphere with agent trained in Phase 2 (with one F-zone).
  • Figure 5: Time history of relative attitude, angular velocity, control torque, and $\theta_\text{margin}$ under agent trained in Phase 2 (with one F-zone).
  • ...and 3 more figures