Can Large Language Models Help Developers with Robotic Finite State Machine Modification?

Xiangyu Robin Gan, Yuxin Ray Song, Nick Walker, Maya Cakmak

TL;DR

The paper investigates whether large language models (LLMs) can assist developers in editing robotic finite state machines (FSMs) from natural language. By introducing ChatFSM, an LLM-driven agent augmented with Retrieval-Augmented Generation, the authors demonstrate the feasibility of language-guided FSM modification on a real-world RoboCup@Home dataset, including a multi-agent preprocessing and validation pipeline. The work shows that LLMs can reproduce structural FSM changes across multiple files, with some limitations around contextual depth and implementation details, and discusses how richer context improves accuracy. Overall, the results indicate that LLM-assisted FSM modification can reduce manual effort and accelerate robotic software updates, while outlining clear directions for broader evaluation and enhancement.

Abstract

Finite state machines (FSMs) are widely used to manage robot behavior logic, particularly in real-world applications that require a high degree of reliability and structure. However, traditional manual FSM design and modification processes can be time-consuming and error-prone. We propose that large language models (LLMs) can assist developers in editing FSM code for real-world robotic use cases. LLMs, with their ability to use context and process natural language, offer a solution for FSM modification with high correctness, allowing developers to update complex control logic through natural language instructions. Our approach leverages few-shot prompting and language-guided code generation to reduce the amount of time it takes to edit an FSM. To validate this approach, we evaluate it on a real-world robotics dataset, demonstrating its effectiveness in practical scenarios.

Paper Structure

This paper contains 23 sections, 12 figures, 3 tables.

Figures (12)

  • Figure 1: Robotics developers can use natural language to modify finite state machines through a chatbot.
  • Figure 2: An FSM represented as a directed graph $G = (S, T)$. The vertices $S$ represent states ($S^I, S_1, S_2, S^X_0$), with $S^I$ as the initial state and $S^X_0$ as a sink state indicating the FSM's termination. Each state has labeled transitions $T$ (e.g., $O^{S_n}_i$) that map outcomes to other states or loop back to the same state, demonstrating how transitions operate within the FSM structure.
  • Figure 3: An FSM representing a robot's navigation task. The states include "Start," "Navigate," "Open Door," "Enter Room," and "Destination." Transitions between these states are driven by specific conditions.
  • Figure 4: The ChatFSM interface, with the FSM visualization on the left and the chatbot input interface on the right.
  • Figure 5: The data processing pipeline consists of multiple stages involving LLM invocations. These stages analyze the dataset by comparing pre-modification and post-modification code snippets to identify differences, extract necessary FSM information, and construct input prompts for the evaluation pipeline.
  • ...and 7 more figures
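
The graph formulation in the Figure 2 caption, with states as vertices and outcome-labeled transitions as edges, can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation: the `FSM` class, its methods, and `build_navigation_fsm` are hypothetical names, and while the state names come from the Figure 3 caption, the outcome labels and the edges between states are assumed for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FSM:
    """Hypothetical FSM as a directed graph G = (S, T) with outcome-labeled edges."""

    initial: str            # the initial state, S^I in the caption's notation
    sinks: frozenset        # sink states that terminate the FSM
    # transitions[state][outcome] -> next state
    transitions: dict = field(default_factory=dict)

    def add_transition(self, source, outcome, target):
        """Add one labeled edge; target may equal source (a self-loop)."""
        self.transitions.setdefault(source, {})[outcome] = target

    def states(self):
        """Collect the vertex set S from the initial state, sinks, and edges."""
        found = {self.initial, *self.sinks}
        for source, edges in self.transitions.items():
            found.add(source)
            found.update(edges.values())
        return found

    def run(self, outcomes):
        """Follow a sequence of outcomes and return the visited states."""
        state = self.initial
        path = [state]
        for outcome in outcomes:
            if state in self.sinks:
                break  # a sink state ends execution
            state = self.transitions[state][outcome]
            path.append(state)
        return path


def build_navigation_fsm():
    """State names follow Figure 3; outcomes and edges are assumed."""
    fsm = FSM(initial="Start", sinks=frozenset({"Destination"}))
    fsm.add_transition("Start", "ready", "Navigate")
    fsm.add_transition("Navigate", "door_closed", "Open Door")
    fsm.add_transition("Navigate", "arrived", "Destination")
    fsm.add_transition("Open Door", "failed", "Open Door")  # self-loop
    fsm.add_transition("Open Door", "opened", "Enter Room")
    fsm.add_transition("Enter Room", "entered", "Navigate")
    return fsm


if __name__ == "__main__":
    fsm = build_navigation_fsm()
    print(fsm.run(["ready", "door_closed", "failed", "opened", "entered", "arrived"]))
```

In this representation, a structural modification of the kind the paper studies (adding a state, or redirecting an outcome) amounts to adding or changing entries in the `transitions` mapping.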