ASTREA: Introducing Agentic Intelligence for Orbital Thermal Autonomy

Alejandro D. Mousist

TL;DR

ASTREA tackles the challenge of enabling autonomous spacecraft operations by fusing a resource-constrained LLM supervisor with a reinforcement learning thermal controller in an asynchronous architecture suitable for flight hardware. The approach assigns the LLM to modulate the SAC entropy coefficient \(\alpha\) based on episode summaries, while the RL agent handles real-time thermal regulation on the edge. Ground tests show substantial improvements in episode duration and thermal compliance, but on-orbit results reveal latency-induced mismatches that can be mitigated by aligning the LLM reasoning window with orbital dynamics, particularly in an orbit-cycle configuration. The study demonstrates the feasibility and practicality of agentic supervision for space autonomy, while identifying latency, grounding, and the need for accelerators as key directions for future work with real-world impact on autonomous space missions.
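The asynchronous split described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class names, the queue-based hand-off, the initial value of 0.2, and especially the `suggest_alpha` rule (a simple heuristic standing in for the LLM call) are assumptions made for illustration. The summary fields follow the quantities listed in the Figure 1 caption: iteration count, iterations near the threshold, and average thermal gradient.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class EpisodeSummary:
    iterations: int       # steps in the last episode
    near_threshold: int   # steps within 1 °C of the thermal limit
    avg_gradient: float   # mean thermal gradient over the episode


def suggest_alpha(summary: EpisodeSummary, current_alpha: float) -> float:
    """Hypothetical stand-in for the LLM call (NOT the paper's prompt or policy)."""
    frac = summary.near_threshold / max(summary.iterations, 1)
    # Illustrative rule only: often near the limit -> lower entropy, else raise it.
    factor = 0.5 if frac > 0.2 else 1.1
    return min(max(current_alpha * factor, 0.01), 1.0)


class AlphaSupervisor:
    """Runs the slow supervisor in a worker thread so the control loop never blocks."""

    def __init__(self, initial_alpha: float = 0.2):
        self.alpha = initial_alpha
        self._requests: queue.Queue = queue.Queue()
        self._replies: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self) -> None:
        while True:
            item = self._requests.get()
            if item is None:  # shutdown sentinel
                break
            summary, alpha = item
            self._replies.put(suggest_alpha(summary, alpha))

    def submit(self, summary: EpisodeSummary) -> None:
        """Called by the RL loop at episode end; returns immediately."""
        self._requests.put((summary, self.alpha))

    def poll(self) -> float:
        """Non-blocking: adopt the newest suggestion if one has arrived."""
        try:
            while True:
                self.alpha = self._replies.get_nowait()
        except queue.Empty:
            pass
        return self.alpha

    def close(self) -> None:
        """Stop the worker after it has drained pending requests."""
        self._requests.put(None)
        self._worker.join()
```

In this sketch the RL loop would call `submit` at the end of each episode and `poll` before each policy update, so a slow supervisor reply delays only the next \(\alpha\) change and never the real-time control step. That delay between a summary and its reply is the kind of latency mismatch the on-orbit results highlight.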

Abstract

This paper presents ASTREA, the first agentic system executed on flight-heritage hardware (TRL 9) for autonomous spacecraft operations, with on-orbit operation aboard the International Space Station (ISS). Using thermal control as a representative use case, we integrate a resource-constrained Large Language Model (LLM) agent with a reinforcement learning controller in an asynchronous architecture tailored for space-qualified platforms. Ground experiments show that LLM-guided supervision improves thermal stability and reduces violations, confirming the feasibility of combining semantic reasoning with adaptive control under hardware constraints. On-orbit validation aboard the ISS was initially hampered by inference latency that was misaligned with the rapid thermal cycles of Low Earth Orbit (LEO) satellites. Synchronizing the LLM agent with the orbit length surpassed the baseline, with reduced violations, extended episode durations, and improved CPU utilization. These findings demonstrate the potential for scalable agentic supervision architectures in future autonomous spacecraft.

Paper Structure

This paper contains 25 sections, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Schematic diagram of the experiment. The RL Agent sends the iteration count, the number of those iterations spent near the threshold ($<1~^\circ\mathrm{C}$), and the average thermal gradient of the last episode. The LLM Agent responds asynchronously with a new $\alpha$ suggestion.
  • Figure 2: System prompt used by the LLM-Agent to configure the behavior of the language model
  • Figure 3: User prompt used by the LLM-Agent to ask the language model for $\alpha$ updates
  • Figure 4: Core-0 is reserved for the agentic thermal control system; the other 15 cores are under heavy load and managed by the agents.
  • Figure 5: Only the RL-agent runs on Core-0 to perform thermal control. As in the agentic version, the load is applied to the other cores.
  • ...and 2 more figures