Stackelberg Game-Theoretic Learning for Collaborative Assembly Task Planning

Yuhan Zhao; Lan Shi; Quanyan Zhu

TL;DR

The paper tackles scalable coordination for two heterogeneous robots in collaborative assembly, where centralized planning struggles with increasing task diversity. It formulates a stochastic Stackelberg game to capture leader–follower interactions and introduces Stackelberg double deep Q-learning to learn equilibrium strategies for both robots. Through simulations on eight assembly tasks, the method outperforms independent Q-learning, Nash Q-learning, and MADDPG, achieving faster completion times and robustness to disturbances. The approach leverages a chessboard representation to decompose tasks and reduce learning complexity, enabling automated, robust, and efficient multi-robot collaboration for customized assembly scenarios.

Abstract

As assembly tasks grow in complexity, collaboration among multiple robots becomes essential for task completion. However, centralized task planning has become inadequate for adapting to the increasing intelligence and versatility of robots, along with rising customized orders. There is a need for efficient and automated planning mechanisms capable of coordinating diverse robots for collaborative assembly. To this end, we propose a Stackelberg game-theoretic learning approach. By leveraging Stackelberg games, we characterize robot collaboration through leader-follower interaction to enhance strategy seeking and ensure task completion. To enhance applicability across tasks, we introduce a novel multi-agent learning algorithm: Stackelberg double deep Q-learning, which facilitates automated assembly strategy seeking and multi-robot coordination. Our approach is validated through simulated assembly tasks. Comparison with three alternative multi-agent learning methods shows that our approach achieves the shortest task completion time across the tested tasks. Furthermore, our approach exhibits robustness against both accidental and deliberate environmental perturbations.
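
As a rough illustration of how a double-Q target could look in a leader-follower setting, the sketch below selects the next joint action with online networks and evaluates it with target networks. The notation is an assumption made for this sketch and is not taken from the paper: $Q^i_{\theta}$ and $Q^i_{\theta^-}$ denote robot $i$'s online and target networks for $i \in \{L, F\}$, $r^i$ is its reward, $\gamma$ is a discount factor, and $a^F_{\mathrm{br}}$ is the follower's best response.

```latex
\begin{aligned}
a^{F}_{\mathrm{br}}(s', a^L) &\in \arg\max_{a^F} Q^F_{\theta}(s', a^L, a^F), \\
a^{L\prime} &\in \arg\max_{a^L} Q^L_{\theta}\bigl(s', a^L, a^{F}_{\mathrm{br}}(s', a^L)\bigr),
\qquad a^{F\prime} = a^{F}_{\mathrm{br}}(s', a^{L\prime}), \\
y^i &= r^i + \gamma\, Q^i_{\theta^-}(s', a^{L\prime}, a^{F\prime}), \qquad i \in \{L, F\}.
\end{aligned}
```

The only difference from a single-agent double-Q target in this sketch is that the greedy action is replaced by a leader-follower action pair; the paper's exact update rule may differ.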

Paper Structure (23 sections, 1 theorem, 14 equations, 11 figures, 4 tables)

Introduction
Related Work
Stackelberg Games in Collaborative Manufacturing
Reinforcement Learning in Collaborative Manufacturing
Learning in Stackelberg Games
Collaborative Assembly Task Description
Task Decomposition
Task Planning and Chessboard Representation
Stackelberg Game-Theoretic Learning for Collaborative Assembly
Stochastic Stackelberg Game Framework
Game Specification for Collaborative Tasks
Stackelberg Q-Functions
Double Deep Q-Network for Stackelberg Learning
Comparison with MARL
Learning Architecture and Communication Between Agents
...and 8 more sections

Key Result

Proposition

Let $\bm{\pi}^* := \langle \pi^{L*}, \pi^{F*} \rangle$ be a SE of the game (eq:sg). Then $\bm{\pi}^*$ is also a SE of the bimatrix game $\langle Q^L_{\bm{\pi}^*}(s,\cdot), Q^F_{\bm{\pi}^*}(s,\cdot) \rangle$ for all $s \in \mathcal{S}$. Conversely, a SE $\bm{\pi}^*$ of the bimatrix game ...
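
The result above ties equilibria of the stochastic game to equilibria of per-state bimatrix games built from the two Q-functions. A minimal sketch of solving one such stage game is given below, under assumptions that are not from the paper: strategies are pure, the Q-values for a fixed state are given as two arrays indexed by (leader action, follower action), and follower ties are broken in the leader's favor. The function name `stackelberg_pure` and the example payoffs are hypothetical.

```python
import numpy as np


def stackelberg_pure(q_leader, q_follower):
    """Return (leader_action, follower_action) of a pure-strategy
    Stackelberg equilibrium of a bimatrix game.

    q_leader[i, j] and q_follower[i, j] are the payoffs when the
    leader plays i and the follower plays j.
    """
    q_leader = np.asarray(q_leader, dtype=float)
    q_follower = np.asarray(q_follower, dtype=float)
    best_value, best_pair = -np.inf, None
    for i in range(q_leader.shape[0]):
        # Follower's best responses to leader action i.
        br = np.flatnonzero(q_follower[i] == q_follower[i].max())
        # Assumed tie-break: among best responses, pick the one
        # that is best for the leader.
        j = int(br[np.argmax(q_leader[i, br])])
        # Leader keeps the action whose induced response pays most.
        if q_leader[i, j] > best_value:
            best_value, best_pair = q_leader[i, j], (i, j)
    return best_pair


# Hypothetical Q-values for one state (rows: leader, columns: follower).
q_l = [[2.0, 4.0], [1.0, 3.0]]
q_f = [[1.0, 0.0], [0.0, 2.0]]
print(stackelberg_pure(q_l, q_f))
```

In this example the leader does not pick the row containing its largest entry, because the follower would not respond with the matching column; anticipating the follower's response is what distinguishes the leader's choice from an independent greedy one.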

Figures (11)

Figure 1: Illustration of Stackelberg Learning framework for collaborative assembly tasks. The real assembly tasks (part 1) are abstracted and decomposed using chessboard representation (part 2) and are then used as virtual tasks for Stackelberg learning between robots (part 3). Robots leverage the learned strategies to collaborate and complete assembly tasks (part 4). Assembly task figures are adapted from [suarez2018can, kunic2021cyber].
Figure 2: The bracket assembly task example. Two robots collaboratively install bolts B1-B8 to connect the bracket parts.
Figure 3: [Left] directed graph representation of the bracket assembly task. [Right] chessboard representation of the task. Each node (resp. block) represents a sub-task. Different colors denote sub-task types. The last row of the chessboard contains available sub-tasks. The same adjacent sub-tasks in the chessboard are merged for compact representation.
Figure 4: Collaborative task planning for the bracket assembly task in three interaction rounds.
Figure 5: Plots of Assembly Tasks 2-4.
...and 6 more figures
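
The Figure 3 caption describes a directed graph of sub-tasks and a set of currently available sub-tasks. One simple way to compute availability from precedence relations is sketched below. The precedence map is hypothetical (the actual edges of the bracket task are not given here), the function name `available_subtasks` is invented for illustration, and the sketch does not reproduce the paper's chessboard construction.

```python
def available_subtasks(predecessors, done):
    """Return the sub-tasks that are not yet done and whose
    prerequisite sub-tasks have all been completed."""
    done = set(done)
    return sorted(
        task
        for task, pre in predecessors.items()
        if task not in done and set(pre) <= done
    )


# Hypothetical precedence relations between four bolts.
predecessors = {
    "B1": [],
    "B2": [],
    "B3": ["B1"],
    "B4": ["B1", "B2"],
}

print(available_subtasks(predecessors, done=[]))
print(available_subtasks(predecessors, done=["B1"]))
```

Recomputing this set after each interaction round gives the sub-tasks the two robots can choose from next.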

Theorems & Definitions (7)

Definition
Remark
Remark
Proposition
Proof
Remark
Definition
