Enhancing Human-Robot Collaborative Assembly in Manufacturing Systems Using Large Language Models

Jonghan Lim, Sujani Patel, Alex Evans, John Pimley, Yifei Li, Ilya Kovalenko

TL;DR

This work addresses the communication bottleneck in human–robot collaborative manufacturing by proposing an LLM-driven framework that translates natural-language (NL) voice commands into robot actions. The framework has a two-layer architecture: a physical layer that handles workspace data and an event-driven NL alert system, and a virtual layer that contains a human command interpreter and a robot task executor. The system uses GPT-4 function calling, OpenAI speech tools (Whisper for transcription and TTS for responses), and a vision stage (YOLOv5) to perform assembly tasks with real-time error handling and re-entry from the last completed subtask $t_{i\text{c}}$. A cable shark assembly case study demonstrates end-to-end NL-guided coordination: success is high when instructions are specific, while vague commands expose weaknesses, which points to the need for clearer initialization and possibly knowledge-base support. The results indicate that LLMs can enhance human–robot interaction in collaborative manufacturing by enabling adaptive responses to language variation and environmental changes, with practical implications for reducing robotics training needs and improving assembly efficiency. In the paper's notation, $T$ denotes the task set, $C$ the robot capabilities, $\{t_{i1},\dots,t_{ik}\}$ the subtasks of task $t_i$, and $t_{i\text{c}}$ the last completed subtask, while $M_{e_i}(t_i)$ and $M_c(t_i)$ represent the error and completion messages, respectively.
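
The re-entry behavior described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Task` class, the `run_task` function, the subtask names, and the message strings are all hypothetical, and the robot is replaced by a stub executor. The sketch only shows how storing the index of the last completed subtask $t_{i\text{c}}$ lets execution resume after an error instead of restarting the task.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """A task t_i with ordered subtasks t_i1..t_ik (all names hypothetical)."""
    name: str
    subtasks: List[str]
    last_completed: int = -1  # index of t_ic; -1 means nothing completed yet


def run_task(task: Task, execute: Callable[[str], bool]) -> str:
    """Run subtasks in order, starting right after the last completed one.

    Returns a plain-text stand-in for the completion message M_c(t_i)
    or the error message M_e(t_i).
    """
    for idx in range(task.last_completed + 1, len(task.subtasks)):
        subtask = task.subtasks[idx]
        if not execute(subtask):
            # Stop here; last_completed still marks the re-entry point.
            return f"Error in '{subtask}' of task '{task.name}'. Please assist."
        task.last_completed = idx
    return f"Task '{task.name}' completed."


# Stub executor (assumption): 'place_part' fails once, then succeeds,
# mimicking an error that a human operator resolves before re-entry.
executed: List[str] = []
failures_left = {"place_part": 1}


def stub_execute(subtask: str) -> bool:
    if failures_left.get(subtask, 0) > 0:
        failures_left[subtask] -= 1
        return False
    executed.append(subtask)
    return True


task = Task("assembly", ["pick_part", "place_part", "fasten_part"])
first_message = run_task(task, stub_execute)   # stops at 'place_part'
second_message = run_task(task, stub_execute)  # resumes at 'place_part'
print(first_message)
print(second_message)
```

In this sketch the second call does not repeat `pick_part`, because the task object remembers that subtask as completed. A real system would also need to verify that the workspace state still matches the recorded progress before resuming.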

Abstract

The development of human-robot collaboration has the ability to improve manufacturing system performance by leveraging the unique strengths of both humans and robots. On the shop floor, human operators contribute with their adaptability and flexibility in dynamic situations, while robots provide precision and the ability to perform repetitive tasks. However, the communication gap between human operators and robots limits the collaboration and coordination of human-robot teams in manufacturing systems. Our research presents a human-robot collaborative assembly framework that utilizes a large language model for enhancing communication in manufacturing environments. The framework facilitates human-robot communication by integrating voice commands through natural language for task management. A case study for an assembly task demonstrates the framework's ability to process natural language inputs and address real-time assembly challenges, emphasizing adaptability to language variation and efficiency in error resolution. The results suggest that large language models have the potential to improve human-robot interaction for collaborative manufacturing assembly applications.

Paper Structure

This paper contains 17 sections, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: Human-Robot Collaborative Assembly Framework Using
  • Figure 2: Sequence Diagram for Human-Robot Collaborative Assembly in Manufacturing Systems
  • Figure 3: Cable Shark Assembly
  • Figure 4: Feature Extraction Method with the Vision System
  • Figure 5: Cable Shark Assembly Process
  • ...and 1 more figure