Safety-Guaranteed Imitation Learning from Nonlinear Model Predictive Control for Spacecraft Close Proximity Operations

Alexander Meinert, Niklas Baldauf, Peter Stadler, Alen Turnwald

Abstract

This paper presents a safety-guaranteed, runtime-efficient imitation learning framework for spacecraft close proximity control. We leverage Control Barrier Functions (CBFs) for safety certificates and Control Lyapunov Functions (CLFs) for stability as unified design principles across data generation, training, and deployment. First, a nonlinear Model Predictive Control (NMPC) expert enforces CBF constraints to provide safe reference trajectories. Second, we train a neural policy with a novel CBF-CLF-informed loss and DAgger-like rollouts with curriculum weighting, promoting data efficiency and reducing future safety filter interventions. Third, at deployment, a lightweight one-step CBF-CLF quadratic program minimally adjusts the learned control input to satisfy hard safety constraints while encouraging stability. We validate the approach for ESA-compliant close proximity operations, including a fly-around with a spherical keep-out zone and a final approach inside a conical approach corridor, using the Basilisk high-fidelity simulator with nonlinear dynamics and perturbations. Numerical experiments indicate stable convergence to decision points and strict adherence to safety constraints under the filter, with task performance comparable to the NMPC expert while significantly reducing online computation. A runtime analysis demonstrates real-time feasibility on a commercial off-the-shelf processor, supporting onboard deployment for safety-critical on-orbit servicing.
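As an illustration of the deployment stage, the following is a minimal sketch of a one-step safety filter for the spherical keep-out zone. It is not the authors' implementation: it assumes double-integrator relative dynamics (the paper uses nonlinear dynamics with perturbations), handles a single CBF constraint only, omits the CLF term, and solves the resulting quadratic program in closed form. The function names, the second-order barrier construction, and the gains `alpha1` and `alpha2` are assumptions made for this example.

```python
import numpy as np


def koz_constraint(r, v, r_koz, alpha1=1.0, alpha2=1.0):
    """Return (a, b) such that the safety condition reads a @ u >= b.

    Assumes double-integrator relative dynamics (r_dot = v, v_dot = u) and
    the barrier h = |r|^2 - r_koz^2, which has relative degree two.
    """
    r = np.asarray(r, dtype=float)
    v = np.asarray(v, dtype=float)
    h = r @ r - r_koz**2
    # First-order auxiliary function: psi1 = h_dot + alpha1 * h
    psi1 = 2.0 * (r @ v) + alpha1 * h
    # Require psi1_dot + alpha2 * psi1 >= 0, where
    # psi1_dot = 2 v.v + 2 r.u + 2 alpha1 r.v
    a = 2.0 * r
    b = -(2.0 * (v @ v) + 2.0 * alpha1 * (r @ v) + alpha2 * psi1)
    return a, b


def project(u_nom, a, b):
    """Solve min |u - u_nom|^2 subject to a @ u >= b in closed form.

    Assumes a is nonzero, i.e. the servicer is not at the target origin.
    """
    u_nom = np.asarray(u_nom, dtype=float)
    slack = a @ u_nom - b
    if slack >= 0.0:
        return u_nom  # nominal command already satisfies the constraint
    # Smallest correction along a that makes the constraint active
    return u_nom - slack / (a @ a) * a


# Far from the keep-out zone: the learned command passes through unchanged.
a, b = koz_constraint(r=[10.0, 0.0, 0.0], v=[-1.0, 0.0, 0.0], r_koz=5.0)
u_far = project([-1.0, 0.0, 0.0], a, b)

# Close to the boundary with inward velocity: the command is corrected.
a, b = koz_constraint(r=[6.0, 0.0, 0.0], v=[-2.0, 0.0, 0.0], r_koz=5.0)
u_near = project([-1.0, 0.0, 0.0], a, b)
```

With several constraints (for example, the keep-out zone together with a CLF condition) the closed form no longer applies and a small QP solver would be needed instead.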

Paper Structure (14 sections, 18 equations, 4 figures, 1 table, 1 algorithm)

Figures (4)

  • Figure 1: Concept of operations for the Close Rendezvous mission phase in accordance with the ESA guidelines on safe close proximity operations (CPO) [ESA2024]. Within the Approach Zone, the servicer performs two operations that require forced motion control with active safety considerations. (1) Spherical fly-around to the decision point "GO for KOZ", aligning with the target angular momentum vector while avoiding the Keep Out Zone (KOZ). (2) Final approach to the decision point "GO for Capture", where entry into the KOZ is permitted only if the conical Approach Corridor bounds are satisfied. The KOZ radius $r_{\text{KOZ}}$ is the sum of the largest dimensions $l_{\text{s}}$ and $l_{\text{t}}$ of the servicer and target, respectively.
  • Figure 2: Overview of the threefold, safety-guaranteed imitation learning framework, which incorporates safety considerations at multiple stages. (1) Data generation: Safety is integrated during data collection, as the NMPC expert policy enforces CBF constraints within the optimal control problem. (2) Training: A tailored loss function augments the conventional imitation loss with CBF and CLF soft constraints, imposing physical safety and stability priors on the learning process. (3) Deployment: During online deployment, a runtime-efficient CBF-CLF-QP safety filter ensures stability and hard constraint satisfaction.
  • Figure 3: Simulation results and controller ablation study.
  • Figure 4: Impact of loss function on severity of safety filter interventions during deployment for 50 simulation runs.
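
The training stage described in the Figure 2 caption augments an imitation loss with CBF and CLF soft constraints. The exact loss used in the paper is not given here, so the following is only a sketch of the general structure under stated assumptions: both conditions are taken to be affine in the control input (`a_cbf @ u >= b_cbf` for safety, `a_clf @ u <= b_clf` for stability), violations are penalized with a hinge, and the weights `w_cbf` and `w_clf` are hypothetical. NumPy is used for brevity; a real training loop would use an automatic differentiation framework.

```python
import numpy as np


def cbf_clf_informed_loss(u_policy, u_expert, a_cbf, b_cbf, a_clf, b_clf,
                          w_cbf=1.0, w_clf=1.0):
    """Imitation loss plus hinge penalties on CBF and CLF violations.

    Control and coefficient arrays have shape (batch, control_dim);
    b_cbf and b_clf have shape (batch,).
    """
    u_policy = np.asarray(u_policy, dtype=float)
    u_expert = np.asarray(u_expert, dtype=float)
    # Mean squared deviation from the expert command
    imitation = np.mean(np.sum((u_policy - u_expert) ** 2, axis=1))
    # Positive only where a_cbf . u < b_cbf (safety condition violated)
    cbf_violation = np.maximum(0.0, b_cbf - np.sum(a_cbf * u_policy, axis=1))
    # Positive only where a_clf . u > b_clf (stability condition violated)
    clf_violation = np.maximum(0.0, np.sum(a_clf * u_policy, axis=1) - b_clf)
    return (imitation
            + w_cbf * np.mean(cbf_violation)
            + w_clf * np.mean(clf_violation))


u = np.array([[1.0, 0.0, 0.0]])
a = np.array([[1.0, 0.0, 0.0]])

# Policy matches the expert and both conditions hold: zero loss.
loss_ok = cbf_clf_informed_loss(u, u, a, np.array([0.0]), a, np.array([2.0]))

# Same command, but the CBF condition now requires a . u >= 3: penalty of 2.
loss_unsafe = cbf_clf_informed_loss(u, u, a, np.array([3.0]), a, np.array([2.0]))
```

Because the penalties vanish whenever the conditions hold, the loss reduces to plain imitation on samples where the policy already behaves safely, which is consistent with the stated goal of reducing later safety filter interventions.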