GUIDE: Guided Updates for In-context Decision Evolution in LLM-Driven Spacecraft Operations

Alejandro Carrasco, Mariko Storey-Matsutani, Victor Rodriguez-Fernandez, Richard Linares

Abstract

Large language models (LLMs) have been proposed as supervisory agents for spacecraft operations, but existing approaches rely on static prompting and do not improve across repeated executions. We introduce GUIDE, a non-parametric policy improvement framework that enables cross-episode adaptation without weight updates by evolving a structured, state-conditioned playbook of natural-language decision rules. A lightweight acting model performs real-time control, while offline reflection updates the playbook from prior trajectories. Evaluated on an adversarial orbital interception task in the Kerbal Space Program Differential Games environment, GUIDE's evolution consistently outperforms static baselines. Results indicate that context evolution in LLM agents functions as policy search over structured decision rules in real-time closed-loop spacecraft interaction.

Paper Structure

This paper contains 16 sections, 4 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Closed-loop control with cross-episode context adaptation. A fixed acting model executes $\pi(a\mid s,P_k)$ online, while offline reflection updates the playbook via add/update/remove and UCB selects improved versions.
  • Figure 2: Evolution comparison.
  • Figure 3: Hill-frame (RTN) trajectories for LG7 (v0 and best).
  • Figure 4: Per-version mean composite score (logarithmic scale, $\pm$1 std) for all four scenarios. Red bars denote the best-performing version.
  • Figure 5: GUIDE playbook bullet schema. The conditions block is a symbolic guard; the bullet text is injected into the LLM prompt only when all conditions evaluate to true on the current observation.
  • ...and 3 more figures
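The playbook mechanism described in the abstract and in Figure 5 (state-conditioned bullets whose text reaches the acting model's prompt only when a symbolic guard holds) can be sketched as follows. This is a minimal illustration, not the paper's implementation; the `Bullet` class, the `range_km` observation field, and the example rules are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Bullet:
    """One playbook entry: a natural-language rule plus a symbolic guard.

    Hypothetical sketch of the schema in Figure 5: `conditions` is a list
    of predicates over the current observation, and the rule text is
    injected into the prompt only when every predicate is true.
    """
    text: str
    conditions: list = field(default_factory=list)

    def active(self, obs: dict) -> bool:
        # Symbolic guard: all conditions must evaluate to true.
        return all(cond(obs) for cond in self.conditions)

def build_prompt(playbook: list, obs: dict) -> str:
    # Only bullets whose guards hold on the current observation
    # reach the acting model's prompt.
    rules = [b.text for b in playbook if b.active(obs)]
    return "Decision rules:\n" + "\n".join(f"- {r}" for r in rules)

# Illustrative playbook (rules and thresholds are invented for the example).
playbook = [
    Bullet("Burn prograde to close range.",
           [lambda o: o["range_km"] > 5.0]),
    Bullet("Null relative velocity before final approach.",
           [lambda o: o["range_km"] <= 5.0]),
]

prompt = build_prompt(playbook, {"range_km": 12.0})
```

With the 12 km observation above, only the first bullet's guard holds, so the prompt carries one rule; offline reflection would then add, update, or remove bullets between episodes, as shown in Figure 1.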