Autonomous Planning In-space Assembly Reinforcement-learning free-flYer (APIARY) International Space Station Astrobee Testing

Samantha Chapin; Kenneth Stewart; Roxana Leontie; Carl Glen Henshaw

TL;DR

The paper addresses autonomous control of free-flying space robots by demonstrating a reinforcement-learning (RL) policy, trained in NVIDIA Omniverse Isaac Lab, that operates NASA's Astrobee on the ISS. The 6-DOF policy is trained with Proximal Policy Optimization (PPO) over randomized goals and mass variations to narrow the sim-to-real gap, and is validated in four settings: Omniverse simulation, Gazebo-based simulation, Granite Lab hardware tests, and ISS flight. The results show that the RL approach can perform basic maneuvers in zero-G, with safety mechanisms that allow fallback to a baseline controller, and they document both performance gaps and robust behavior. The work demonstrates the feasibility of RL-driven autonomy for space robotics, outlines a rapid development pathway built on parallel simulation, and points to future work on more complex tasks and on assembly, integration, and test (AI&T) workflows for in-space servicing, assembly, and manufacturing (ISAM).

Abstract

The US Naval Research Laboratory's (NRL's) Autonomous Planning In-space Assembly Reinforcement-learning free-flYer (APIARY) experiment pioneers the use of reinforcement learning (RL) for control of free-flying robots in the zero-gravity (zero-G) environment of space. On Tuesday, May 27, 2025, the APIARY team conducted what is, to our knowledge, the first RL control of a free-flyer in space, using the NASA Astrobee robot on board the International Space Station (ISS). A robust 6-degree-of-freedom (DOF) control policy was trained using an actor-critic Proximal Policy Optimization (PPO) network within the NVIDIA Isaac Lab simulation environment, randomizing over goal poses and mass distributions to enhance robustness. This paper details the simulation testing, ground testing, and flight validation of this experiment. The on-orbit demonstration validates the transformative potential of RL for improving robotic autonomy, enabling rapid development and deployment (in minutes to hours) of tailored behaviors for space exploration, logistics, and real-time mission needs.
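
As a rough illustration of what "randomizing over goal poses and mass distributions" can look like at the start of each training episode, the following is a minimal sketch in plain Python. It is not the APIARY training code and does not use the Isaac Lab API; the names `EpisodeParams` and `sample_episode`, the position range, the nominal mass, and the mass scale are all hypothetical placeholders chosen for the example. It also randomizes only a scalar mass, whereas the abstract refers to mass distributions, which would additionally involve inertia and center-of-mass terms.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    goal_position: tuple    # (x, y, z) offset in metres (hypothetical frame)
    goal_quaternion: tuple  # unit quaternion describing the goal attitude
    mass_kg: float          # robot mass used by the simulator this episode


def random_unit_quaternion(rng):
    """Draw a uniformly distributed rotation (Shoemake's method)."""
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    a, b = math.sqrt(1.0 - u1), math.sqrt(u1)
    return (
        a * math.sin(2.0 * math.pi * u2),
        a * math.cos(2.0 * math.pi * u2),
        b * math.sin(2.0 * math.pi * u3),
        b * math.cos(2.0 * math.pi * u3),
    )


def sample_episode(rng, pos_range_m=0.5, nominal_mass_kg=10.0, mass_scale=0.2):
    """Sample one randomized episode.

    All default values are placeholders for illustration, not values
    reported in the paper.
    """
    # Goal position: uniform in a cube of half-width pos_range_m.
    goal_position = tuple(rng.uniform(-pos_range_m, pos_range_m) for _ in range(3))
    # Goal attitude: uniform over all 3-D rotations.
    goal_quaternion = random_unit_quaternion(rng)
    # Mass: nominal value scaled by a factor in [1 - mass_scale, 1 + mass_scale].
    mass_kg = nominal_mass_kg * rng.uniform(1.0 - mass_scale, 1.0 + mass_scale)
    return EpisodeParams(goal_position, goal_quaternion, mass_kg)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for _ in range(3):
        print(sample_episode(rng))
```

In a parallel simulator, a routine of this kind would typically be called once per environment at reset, so that each of the many simulated robots sees a different goal and a different mass. A policy trained this way cannot rely on one specific payload or target pose.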
