Table of Contents
Fetching ...

Fine-tuning LLMs for Autonomous Spacecraft Control: A Case Study Using Kerbal Space Program

Alejandro Carrasco, Victor Rodriguez-Fernandez, Richard Linares

TL;DR

This study explores the use of fine-tuned Large Language Models for autonomous spacecraft control, using the Kerbal Space Program Differential Games suite (KSPDG) as a testing environment and demonstrates how these models can effectively control spacecraft using language-based inputs and outputs.

Abstract

Recent trends are emerging in the use of Large Language Models (LLMs) as autonomous agents that take actions based on the content of the user text prompt. This study explores the use of fine-tuned Large Language Models (LLMs) for autonomous spacecraft control, using the Kerbal Space Program Differential Games suite (KSPDG) as a testing environment. Traditional Reinforcement Learning (RL) approaches face limitations in this domain due to insufficient simulation capabilities and data. By leveraging LLMs, specifically fine-tuning models like GPT-3.5 and LLaMA, we demonstrate how these models can effectively control spacecraft using language-based inputs and outputs. Our approach integrates real-time mission telemetry into textual prompts processed by the LLM, which then generate control actions via an agent. The results open a discussion about the potential of LLMs for space operations beyond their nominal use for text-related tasks. Future work aims to expand this methodology to other space control tasks and evaluate the performance of different LLM families. The code is available at this URL: \texttt{https://github.com/ARCLab-MIT/kspdg}.

Fine-tuning LLMs for Autonomous Spacecraft Control: A Case Study Using Kerbal Space Program

TL;DR

This study explores the use of fine-tuned Large Language Models for autonomous spacecraft control, using the Kerbal Space Program Differential Games suite (KSPDG) as a testing environment and demonstrates how these models can effectively control spacecraft using language-based inputs and outputs.

Abstract

Recent trends are emerging in the use of Large Language Models (LLMs) as autonomous agents that take actions based on the content of the user text prompt. This study explores the use of fine-tuned Large Language Models (LLMs) for autonomous spacecraft control, using the Kerbal Space Program Differential Games suite (KSPDG) as a testing environment. Traditional Reinforcement Learning (RL) approaches face limitations in this domain due to insufficient simulation capabilities and data. By leveraging LLMs, specifically fine-tuning models like GPT-3.5 and LLaMA, we demonstrate how these models can effectively control spacecraft using language-based inputs and outputs. Our approach integrates real-time mission telemetry into textual prompts processed by the LLM, which then generate control actions via an agent. The results open a discussion about the potential of LLMs for space operations beyond their nominal use for text-related tasks. Future work aims to expand this methodology to other space control tasks and evaluate the performance of different LLM families. The code is available at this URL: \texttt{https://github.com/ARCLab-MIT/kspdg}.
Paper Structure (3 sections, 3 figures, 1 table)

This paper contains 3 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: Overview of the proposed approach to use a fine-tuned LLM (e.g. ChatGPT, LLaMA) as an autonomous spacecraft operator that gets, as user prompt, the current status of the mission from the KSDPG simulation environment (i.e., the state or observation in the RL jargon), and replies with a reasoned action to carry out, expressed as a function calling with the specific throttle vector and the textual justification behind the action.
  • Figure 2: Diagram of the data generation process for fine-tuning the model. The sequence is as follows: (1) The orbit generator is invoked to create a new orbit. (2) The new orbit is saved into KSP. (3) The navball agent is activated to navigate the orbit and generate logs. (4) The logs are saved by the orbit generator. (5) After sufficient runs (e.g., 100), the script data parser converts the logs into text suitable for LLM processing.
  • Figure 3: This 3D plot depicts the trajectories of the best-performing fine-tuned models for GPT and LLaMA, along with the evader's path. Due to an incorrect hint, the GPT model deviates significantly after overshooting its target, while the LLaMA model maintains a closer trajectory to the evader.