Stability Analysis of Deep Reinforcement Learning for Multi-Agent Inspection in a Terrestrial Testbed

Henry Lei, Zachary S. Lippay, Anonto Zaman, Joshua Aurand, Amin Maghareh, Sean Phillips

TL;DR

The paper addresses robust autonomous multi-agent satellite inspection under modeling uncertainties and limited communications. It proposes a hierarchical DRL framework that pairs a high-level guidance planner with a low-level motion controller, augments both with barrier-based runtime assurance, and evaluates the result on the LINCS platform across four fidelity levels. Key contributions include a formal problem formulation with Clohessy-Wiltshire-Hill (CWH) dynamics and $J_2$ perturbations, an integrated RL-based guidance and control architecture, and comprehensive experiments showing high task completion rates with controlled degradation in time and distance as fidelity increases. These results demonstrate the potential for scalable, robust autonomous satellite operations and provide a practical path for bridging the sim-to-real gap in space missions.

Abstract

The design and deployment of autonomous systems for space missions require robust solutions to navigate strict reliability constraints, extended operational duration, and communication challenges. This study evaluates the stability and performance of a hierarchical deep reinforcement learning (DRL) framework designed for multi-agent satellite inspection tasks. The proposed framework integrates a high-level guidance policy with a low-level motion controller, enabling scalable task allocation and efficient trajectory execution. Experiments conducted on the Local Intelligent Network of Collaborative Satellites (LINCS) testbed assess the framework's performance under varying levels of fidelity, from simulated environments to a cyber-physical testbed. Key metrics, including task completion rate, distance traveled, and fuel consumption, highlight the framework's robustness and adaptability despite real-world uncertainties such as sensor noise, dynamic perturbations, and runtime assurance (RTA) constraints. The results demonstrate that the hierarchical controller effectively bridges the sim-to-real gap, maintaining high task completion rates while adapting to the complexities of real-world environments. These findings validate the framework's potential for enabling autonomous satellite operations in future space missions.

Paper Structure

This paper contains 30 sections, 25 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: The left diagram shows the simulation workflow with RTA enforcement in the LINCS Lab. The right diagram shows the analogous workflow for LINCS cyber-physical emulation using quadrotor UAVs.
  • Figure 2: Example trajectory for the hierarchical controller, generated in the Hierarchical Inspection Environment. The top figure shows the deputies' trajectories as relative position over time, overlaid on the graph of inspection points. The bottom-left figure shows the HL guidance as the index of the inspection point visited over time. The bottom-right figure shows the distance from the goal for the LL controller.
  • Figure 3: Trajectory data for Experiment 2. The top-left figure shows the deputies' trajectories as relative position over time. The top-right shows the HL guidance as the index of the inspection point visited over time. The bottom-left shows the distance from the goal for the LL controller. The bottom-right shows RTA activation data over time.
  • Figure 4: Trajectory data for Experiment 3. The top-left figure shows the deputies' trajectories as relative position over time. The top-right shows the HL guidance as the index of the inspection point visited over time. The bottom-left shows the distance from the goal for the LL motion controller. The bottom-right shows RTA activation data over time. In this trial, the velocity limit was exceeded multiple times (bolded), while the inter-agent distance constraint was not triggered.