OrbitZoo: Real Orbital Systems Challenges for Reinforcement Learning

Alexandre Oliveira, Katarina Dyreby, Francisco Caldas, Cláudia Soares

TL;DR

OrbitZoo provides a high-fidelity, open-source multi-agent RL environment for orbital dynamics by integrating Orekit-based propagation with PettingZoo-compatible multi-agent RL (MARL) frameworks. It addresses sim-to-real gaps, continuous-time thrust control, partial observability, and safety concerns, and is validated against Starlink ephemerides and on diverse missions including Hohmann transfers, collision avoidance maneuvers (CAM), and GEO constellations. The results show near-optimal policy behavior under realistic perturbations and generalization across missions and perturbations, highlighting the platform's potential for robust, autonomous space operations. By enabling scalable, reproducible experiments with real-world validation data, OrbitZoo serves as a practical benchmark for RL in space and a stepping stone toward autonomous space traffic management (STM) and constellation management.

Abstract

The increasing number of satellites and orbital debris has made space congestion a critical issue, threatening satellite safety and sustainability. Challenges such as collision avoidance, station-keeping, and orbital maneuvering require advanced techniques to handle dynamic uncertainties and multi-agent interactions. Reinforcement learning (RL) has shown promise in this domain, enabling adaptive, autonomous policies for space operations. However, many existing RL frameworks rely on custom-built environments developed from scratch, which often use simplified models and require significant time to implement and validate the orbital dynamics, limiting their ability to fully capture real-world complexities. To address this, we introduce OrbitZoo, a versatile multi-agent RL environment built on a high-fidelity, industry-standard library that enables realistic data generation, supports scenarios like collision avoidance and cooperative maneuvers, and ensures robust and accurate orbital dynamics. The environment is validated against a real satellite constellation, Starlink, achieving a Mean Absolute Percentage Error (MAPE) of 0.16% compared to real-world data. This validation ensures reliability for generating high-fidelity simulations and enabling autonomous and independent satellite operations.
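
For readers unfamiliar with the validation metric, the following is a minimal sketch of how a MAPE between propagated and reference values can be computed. It uses the standard textbook definition; the abstract does not state which quantities (position components, radii, orbital elements) enter the reported 0.16%, so the function name `mape` and the sample values below are hypothetical and purely illustrative.

```python
from typing import Sequence


def mape(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Return the Mean Absolute Percentage Error, expressed in percent.

    Standard definition: 100 * mean(|(reference - predicted) / reference|).
    This is an illustrative helper, not code taken from OrbitZoo.
    """
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(r == 0 for r in reference):
        raise ValueError("reference values must be non-zero")
    total = sum(abs((r - p) / r) for r, p in zip(reference, predicted))
    return 100.0 * total / len(reference)


# Hypothetical orbital radii in km (made-up values, not data from the paper).
reference_km = [6920.0, 6921.0, 6922.0]
propagated_km = [6930.0, 6911.0, 6922.0]
print(f"MAPE = {mape(reference_km, propagated_km):.3f}%")
```

Because MAPE divides by the reference value, it is only meaningful for quantities that stay away from zero, which is why the sketch rejects zero reference entries.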

Paper Structure

This paper contains 105 sections, 51 equations, 52 figures, 18 tables.

Figures (52)

  • Figure 1: Frames of different systems on OrbitZoo's interface.
  • Figure 2: The L2 error between the optimal and Experiment 2 maneuvers stays low over long orbits, with most error due to minor inclination shifts. However, agents were trained to minimize equinoctial element differences, not Euclidean distance.
  • Figure 3: Visualization of the initial random constellation and its evolution over nearly 4 days, with the red circle representing the GEO orbit. After this period, the agents exhibit more distinct anomalies while maintaining proximity to the nominal orbit.
  • Figure 4: Residuals between OrbitZoo-propagated trajectories and Starlink ephemeris data for satellite 44748. The residuals remain low over the validation interval, confirming OrbitZoo’s high fidelity.
  • Figure 5: High-level overview of OrbitZoo’s architecture. A JSON file describing the orbital system is provided to OrbitZoo, which then serves as an interface for data generation, single- and multi-agent RL mission development, or orbital dynamics analysis.
  • ...and 47 more figures