Deep Reinforcement Learning for Scalable Multiagent Spacecraft Inspection

Kyle Dunlap, Nathaniel Hamilton, Kerianne L. Hobbs

TL;DR

This work addresses the challenge of coordinating multiple autonomous spacecraft for inspection tasks as fleet size grows. It introduces scalable, fixed-size observation spaces that encode information about other agents in a lidar-like fashion, enabling a single neural policy to handle varying numbers of deputies. Using 6-DoF dynamics expressed in Hill's frame, with safety enforced by Active Set Invariance Filtering (ASIF) built on control barrier functions (CBFs), the authors train cooperative deputies to inspect a passive chief while minimizing fuel use and avoiding collisions. Empirical results show that the Points-Dist scalable observation space delivers the best performance and transfers robustly to scenarios involving more agents, highlighting the method's potential for scalable autonomous space operations.

Abstract

As the number of spacecraft in orbit continues to increase, it is becoming more challenging for human operators to manage each mission. As a result, autonomous control methods are needed to reduce this burden on operators. One method of autonomous control is Reinforcement Learning (RL), which has proven highly successful across a variety of complex tasks. For missions with multiple controlled spacecraft, or agents, it is critical for the agents to communicate and have knowledge of each other; this information is typically given to the Neural Network Controller (NNC) as an input observation. Rather than modifying the size of the observation as the number of spacecraft used for the mission increases or decreases, this paper develops a scalable observation space that uses a constant observation size to give information on all of the other agents. This approach is similar to a lidar sensor, which determines the ranges of other objects in the environment. This observation space is applied to a spacecraft inspection task, where RL is used to train multiple deputy spacecraft to cooperate and inspect a passive chief spacecraft. It is expected that the scalable observation space will allow the agents to learn to complete the task more efficiently compared to a baseline solution where no information is communicated between agents.
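The idea of a constant-size, lidar-like observation can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sector_distance_observation`, the planar (2D) geometry, the number of sectors, and the `max_range` fill value for empty sectors are all assumptions made for illustration. The sketch divides the space around one agent into equal angular sectors and records the distance to the nearest other agent in each sector, so the output length depends only on the number of sectors and not on the number of agents.

```python
import numpy as np


def sector_distance_observation(own_pos, other_pos, n_sectors=8, max_range=1000.0):
    """Hypothetical helper: nearest-agent distance per angular sector.

    own_pos:   (2,) position of the observing agent (planar simplification).
    other_pos: (N, 2) positions of the other agents; N may be any size, including 0.
    Returns a vector of length n_sectors regardless of N.
    """
    # Sectors that contain no agent report max_range (an assumed convention).
    obs = np.full(n_sectors, max_range, dtype=float)
    others = np.asarray(other_pos, dtype=float).reshape(-1, 2)
    if others.shape[0] == 0:
        return obs

    rel = others - np.asarray(own_pos, dtype=float)
    dist = np.linalg.norm(rel, axis=1)

    # Bearing of each other agent, wrapped to [0, 2*pi).
    angle = np.arctan2(rel[:, 1], rel[:, 0]) % (2.0 * np.pi)
    sector = np.floor(angle / (2.0 * np.pi / n_sectors)).astype(int) % n_sectors

    # Keep the smallest (clipped) distance that falls in each sector.
    np.minimum.at(obs, sector, np.minimum(dist, max_range))
    return obs


if __name__ == "__main__":
    two_others = [[3.0, 4.0], [-6.0, 8.0]]
    three_others = two_others + [[30.0, 40.0]]
    # The observation length stays the same as agents are added.
    print(sector_distance_observation([0.0, 0.0], two_others, n_sectors=4).shape)
    print(sector_distance_observation([0.0, 0.0], three_others, n_sectors=4).shape)
```

Because the vector length is fixed by `n_sectors`, the same network input layer could in principle be reused when the number of deputies changes, which is the property the abstract describes. A 3D version would need a different partition of directions (for example, azimuth and elevation bins), which this sketch does not attempt.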

Paper Structure

This paper contains 17 sections, 39 equations, 8 figures.

Figures (8)

  • Figure 1: RL feedback control loop [hamilton2023ablation].
  • Figure 2: Feedback control system with RTA [dunlap2024run].
  • Figure 3: Hill's reference frame [dunlap2024run].
  • Figure 4: Observation space comparison.
  • Figure 5: RL agent performance during training. The dark line represents the IQM for each metric, and the shaded regions represent the 95% confidence intervals.
  • ...and 3 more figures