OrbitZoo: Real Orbital Systems Challenges for Reinforcement Learning

Alexandre Oliveira; Katarina Dyreby; Francisco Caldas; Cláudia Soares

TL;DR

OrbitZoo is a high-fidelity, open-source multi-agent RL (MARL) environment for orbital dynamics that integrates Orekit-based propagation with PettingZoo-compatible MARL frameworks. It addresses sim-to-real gaps, continuous-time thrust control, partial observability, and safety concerns. The environment is validated against Starlink ephemerides and exercised on diverse missions, including Hohmann transfers, collision avoidance maneuvers (CAM), and GEO constellations. The results show near-optimal policy behavior under realistic perturbations and generalization across missions and perturbation settings. By enabling scalable, reproducible experiments with real-world validation data, OrbitZoo serves as a practical benchmark for RL in space and a stepping stone toward autonomous space traffic management (STM) and constellation management.

Abstract

The increasing number of satellites and orbital debris has made space congestion a critical issue, threatening satellite safety and sustainability. Challenges such as collision avoidance, station-keeping, and orbital maneuvering require advanced techniques to handle dynamic uncertainties and multi-agent interactions. Reinforcement learning (RL) has shown promise in this domain, enabling adaptive, autonomous policies for space operations. However, many existing RL frameworks rely on custom environments built from scratch. These often use simplified models and require significant time to implement and validate the orbital dynamics, which limits their ability to capture real-world complexities. To address this, we introduce OrbitZoo, a versatile multi-agent RL environment built on a high-fidelity, industry-standard library. It enables realistic data generation, supports scenarios such as collision avoidance and cooperative maneuvers, and ensures robust and accurate orbital dynamics. The environment is validated against a real satellite constellation, Starlink, achieving a Mean Absolute Percentage Error (MAPE) of 0.16% compared to real-world data. This validation ensures reliability for generating high-fidelity simulations and enabling autonomous and independent satellite operations.
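For readers unfamiliar with the validation metric, the following is a minimal sketch of how a MAPE figure of this kind can be computed. The abstract does not state which quantity is compared or how samples are aggregated, so the function name `mape`, the choice of orbital radius as the compared quantity, and the sample values are all assumptions made for illustration only.

```python
def mape(reference, simulated):
    """Mean Absolute Percentage Error, in percent, of simulated vs. reference values."""
    if len(reference) != len(simulated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(r == 0 for r in reference):
        raise ValueError("reference values must be non-zero")
    # Average the relative absolute errors, then scale to percent.
    total = sum(abs((r - s) / r) for r, s in zip(reference, simulated))
    return 100.0 * total / len(reference)


# Hypothetical orbital radii in km (not data from the paper).
reference_km = [6921.0, 6922.5, 6920.2]
simulated_km = [6932.0, 6911.0, 6931.5]
print(f"MAPE: {mape(reference_km, simulated_km):.3f}%")
```

Because each error is normalized by its reference value, the metric is scale-free, which makes it convenient for comparing propagation accuracy across orbits of different sizes.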
