Learning-Aided Control of Robotic Tether-Net with Maneuverable Nodes to Capture Large Space Debris

Achira Boonrath, Feng Liu, Elenora M. Botta, Souma Chowdhury

TL;DR

This work tackles active debris removal with a maneuverable tether-net by framing it as a hierarchical control problem. A centralized planner, trained with policy-gradient reinforcement learning (PPO), computes the final aiming position of each maneuverable unit (MU) from the debris state, while decentralized PID controllers track the resulting trajectories under noisy sensing. A recurrent surrogate model predicts capture outcomes to accelerate training, and the RL objective places the MUs to maximize capture success while minimizing fuel. The approach is demonstrated on 4-MU and 8-MU designs, with and without closing mechanisms. High-fidelity simulations show notable fuel reductions and 100% capture success across varied target positions, highlighting the practical potential of learning-aided tether-net systems for large space debris removal.

Abstract

Maneuverable tether-net systems launched from an unmanned spacecraft offer a promising solution for the active removal of large space debris. Guaranteeing the successful capture of such space debris is dependent on the ability to reliably maneuver the tether-net system -- a flexible, many-DoF (thus complex) system -- for a wide range of launch scenarios. Here, scenarios are defined by the relative location of the debris with respect to the chaser spacecraft. This paper represents and solves this problem as a hierarchically decentralized implementation of robotic trajectory planning and control and demonstrates the effectiveness of the approach when applied to two different tether-net systems, with 4 and 8 maneuverable units (MUs), respectively. Reinforcement learning (policy gradient) is used to design the centralized trajectory planner that, based on the relative location of the target debris at the launch of the net, computes the final aiming positions of each MU, from which their trajectory can be derived. Each MU then seeks to follow its assigned trajectory by using a decentralized PID controller that outputs the MU's thrust vector and is informed by noisy sensor feedback (for realism) of its relative location. System performance is assessed in terms of capture success and overall fuel consumption by the MUs. Reward shaping and surrogate models are used to respectively guide and speed up the RL process. Simulation-based experiments show that this approach allows the successful capture of debris at fuel costs that are notably lower than nominal baselines, including in scenarios where the debris is significantly off-centered compared to the approaching chaser spacecraft.
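
To make the lower level of the hierarchy concrete, the following is a minimal sketch of one decentralized PID loop: a single MU, modeled here as a free point mass, is driven toward an assigned aiming position using only noisy position measurements, and the controller outputs a thrust vector. Everything specific in it is an assumption for illustration and is not taken from the paper: the names `PID3D` and `simulate_mu`, the gains, the thrust saturation, the Gaussian noise level, the point-mass dynamics (no tethers, no orbital effects), and the use of integrated thrust magnitude as a rough fuel proxy.

```python
import math
import random


class PID3D:
    """Three independent PID loops (x, y, z); the output is a thrust vector."""

    def __init__(self, kp, ki, kd, max_thrust):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.max_thrust = max_thrust          # assumed actuator limit
        self.integral = [0.0, 0.0, 0.0]
        self.prev_error = None

    def update(self, target, measured, dt):
        error = [t - m for t, m in zip(target, measured)]
        if self.prev_error is None:
            derivative = [0.0, 0.0, 0.0]      # no history on the first call
        else:
            derivative = [(e - p) / dt for e, p in zip(error, self.prev_error)]
        self.integral = [i + e * dt for i, e in zip(self.integral, error)]
        self.prev_error = error
        thrust = [self.kp * e + self.ki * i + self.kd * d
                  for e, i, d in zip(error, self.integral, derivative)]
        # Scale the vector down if its magnitude exceeds the thrust limit.
        norm = math.sqrt(sum(c * c for c in thrust))
        if norm > self.max_thrust:
            thrust = [c * self.max_thrust / norm for c in thrust]
        return thrust


def simulate_mu(target, mass=1.0, dt=0.05, steps=800, noise_std=0.005, seed=0):
    """Drive one point-mass MU to `target`; return (final error, impulse used)."""
    rng = random.Random(seed)
    pid = PID3D(kp=1.0, ki=0.05, kd=2.0, max_thrust=5.0)  # illustrative gains
    pos = [0.0, 0.0, 0.0]
    vel = [0.0, 0.0, 0.0]
    impulse = 0.0
    for _ in range(steps):
        # The controller only sees a noisy measurement of the true position.
        measured = [p + rng.gauss(0.0, noise_std) for p in pos]
        thrust = pid.update(target, measured, dt)
        # Semi-implicit Euler step of the point-mass dynamics.
        vel = [v + (f / mass) * dt for v, f in zip(vel, thrust)]
        pos = [p + v * dt for p, v in zip(pos, vel)]
        # Integrated thrust magnitude, used here as a crude fuel proxy.
        impulse += math.sqrt(sum(f * f for f in thrust)) * dt
    error = math.sqrt(sum((t - p) ** 2 for t, p in zip(target, pos)))
    return error, impulse


if __name__ == "__main__":
    final_error, used_impulse = simulate_mu((1.0, -0.5, 2.0))
    print(f"final position error: {final_error:.4f} m")
    print(f"impulse used: {used_impulse:.3f} N*s")
```

In a setup like the one the abstract describes, the `target` passed to each MU's controller would come from the centralized planner, and one such loop would run per MU. Differentiating a noisy measurement amplifies the noise, so a practical controller would likely filter the derivative term; the sketch omits that for brevity.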

Paper Structure (12 sections, 6 equations, 8 figures, 3 tables)


Figures (8)

  • Figure 1: Representation of the 8-MU Tether-Net System
  • Figure 2: PID Controller Design
  • Figure 3: Framework with Simulator, RNN and RL
  • Figure 4: L2-Norm of MUs' position error (left Y-axis) and distance of net from target (right Y-axis), over time
  • Figure 5: Reward Averaged Over 32 Episodes
  • ...and 3 more figures