Deployment of Large Language Models to Control Mobile Robots at the Edge

Pascal Sikorski, Leendert Schrader, Kaleb Yu, Lucy Billadeau, Jinka Meenakshi, Naveena Mutharasan, Flavio Esposito, Hadi AliAkbarpour, Madi Babaiasl

TL;DR

Results show that GPT-4-Turbo delivers superior performance in interpreting and executing complex commands accurately, whereas LLaMA 2 exhibits significant limitations in consistency and reliability of command execution.

Abstract

This paper investigates the possibility of intuitive human-robot interaction through the application of Natural Language Processing (NLP) and Large Language Models (LLMs) in mobile robotics. It explores the feasibility of deploying these technologies at the edge, where traditional cloud dependencies are eliminated. The study specifically contrasts the performance of GPT-4-Turbo, which requires cloud connectivity, with an offline-capable, quantized version of LLaMA 2 (LLaMA 2-7B.Q5_K_M). The results show that GPT-4-Turbo delivers superior performance in interpreting and executing complex commands accurately, whereas LLaMA 2 exhibits significant limitations in the consistency and reliability of command execution. Communication between the control computer and the mobile robot is established via a Raspberry Pi Pico W, which wirelessly receives commands from the computer without internet dependency and transmits them through a wired connection to the robot's Arduino controller. This study highlights the potential and challenges of implementing LLMs and NLP at the edge, providing groundwork for future research into fully autonomous and network-independent robotic systems. For video demonstrations and source code, please refer to: https://tinyurl.com/MobileRobotGPT4LLaMA2024.
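
The relay path described in the abstract (control computer to Raspberry Pi Pico W over a local wireless link, then a wired link to the Arduino) can be illustrated with a minimal sketch of the computer side. The abstract does not specify the transport or message format, so everything here is an assumption made for illustration: the use of UDP, the address and port in `PICO_ADDR`, the `ACTION value` line format, and the function names `encode_command` and `send_command`.

```python
import socket

# Hypothetical address of the Pico W on a local, internet-free network.
# Both the IP and the port are assumptions, not values from the paper.
PICO_ADDR = ("192.168.4.1", 5005)


def encode_command(action: str, value: float) -> bytes:
    """Encode one motion command as a newline-terminated ASCII line.

    The 'ACTION value' layout is an invented example format.
    """
    return f"{action.upper()} {value:.2f}\n".encode("ascii")


def send_command(action: str, value: float, addr=PICO_ADDR) -> int:
    """Send one command as a single UDP datagram; returns bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(encode_command(action, value), addr)
```

For example, `send_command("forward", 1.5)` would transmit the line `FORWARD 1.50` to the assumed address. A line-oriented text format is chosen here only because it is easy to forward unchanged over a serial link; a real deployment might prefer TCP or an acknowledgement scheme if dropped commands matter.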

Paper Structure

This paper contains 15 sections, 5 figures, 2 tables, and 1 algorithm.

Figures (5)

  • Figure 1: The schematic of the integrated mobile robot system architecture showing the workflow from voice input to robot action, including offline speech recognition, language understanding modules, and task execution via Raspberry Pi Pico W and Arduino.
  • Figure 2: Experimental setup. A Dell Precision 3660 Tower serves as the computing device, running Python on Ubuntu 22.04 to process speech commands captured by a Blue Yeti X USB microphone. The processed commands are sent wirelessly to a Raspberry Pi Pico W, which relays them to the robot over a serial connection. The robot is controlled through its modified Arduino Uno board. The setup demonstrates the feasibility of edge-based command execution in mobile robotics.
  • Figure 3: Diagram illustrating the fundamental motions for the wheeled mobile robot with differential steering: Motor Driver Channels (a), Moving Forward (b), Moving Backward (c), Turning Left (d), and Turning Right (e).
  • Figure 4: Calibration and noise filtering of ultrasonic sensor data. The left graph shows the linear relationship between the actual distance and the ultrasonic sensor readings, with a trend line; the right graph shows the effect of the noise-filtering process on ultrasonic sensor data over time. Note that the distance of the object in front of the sensor gradually increased over time.
  • Figure 5: Some illustrative representations of the autonomous robotic response to diverse natural language commands. The sequence demonstrates the mobile robot's capability to interpret and execute commands ranging from basic directional instructions to complex navigational sequences. Each panel captures a moment in the process flow where the robot performs actions such as moving specific distances, executing precise turns, responding to ambiguous commands, and combining multiple maneuvers.
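
The low-level pieces described in the captions for Figures 3 and 4 (differential-steering motions, linear calibration of the ultrasonic sensor, and noise filtering) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the captions do not name the filter used, so a sliding-window median stands in for it; the trend line is taken to be an ordinary least-squares fit; and the wheel sign convention, including turning by counter-rotating the wheels, is assumed. The names `MOTIONS`, `fit_line`, and `median_filter` are hypothetical.

```python
from statistics import median

# Wheel directions per motion for a differential-drive robot, as
# (left_wheel, right_wheel) with +1 = forward and -1 = backward.
# Turning in place by counter-rotating the wheels is an assumption.
MOTIONS = {
    "forward": (1, 1),
    "backward": (-1, -1),
    "left": (-1, 1),
    "right": (1, -1),
}


def fit_line(readings, actual):
    """Least-squares fit of actual = slope * reading + intercept."""
    n = len(readings)
    mean_x = sum(readings) / n
    mean_y = sum(actual) / n
    sxx = sum((x - mean_x) ** 2 for x in readings)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(readings, actual))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def median_filter(samples, window=3):
    """Sliding-window median; the window is shorter at the series start."""
    out = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        out.append(median(samples[start:i + 1]))
    return out
```

A calibrated distance would then be `slope * reading + intercept` applied to each filtered reading. A median is used in this sketch because it suppresses isolated spikes without smearing them into neighboring samples; a moving average or another filter could be substituted if that better matches the sensor's noise.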