Optimal Robotic Assembly Sequence Planning: A Sequential Decision-Making Approach
Kartik Nagpal, Negar Mehr
TL;DR
This work reframes robotic assembly sequencing as an optimal control problem and reduces it to a shortest-path problem on a consolidated state-action graph to enable scalable planning. It introduces Graph Exploration Assembly Planners (GEAPs), the ORASP on-demand search, and a Deep Q Network (DQN) approach to handle large, nonlinear objective functions and constraints such as sequential precedence and unconnected parts. The framework achieves substantial speedups over state-of-the-art integer linear programming (ILP) in moderate settings and demonstrates near-optimal or superior performance on very large structures, with publicly available code. Together, these methods advance scalable, constraint-aware planning for multi-robot assembly with broad practical implications for industrial and space applications.
Abstract
The optimal robot assembly planning problem is challenging because it requires finding the optimal solution among an exponentially large number of possible plans while satisfying a set of constraints. Traditionally, robotic assembly planning problems have been solved using heuristics, but these methods are specific to a given objective structure or set of problem parameters. In this paper, we propose a novel approach to robotic assembly planning that poses assembly sequencing as a sequential decision-making problem, enabling us to harness methods that far outperform the state of the art. We formulate the problem as a Markov Decision Process (MDP) and utilize Dynamic Programming (DP) to find optimal assembly policies for moderately sized structures. We further expand our framework to exploit the deterministic nature of assembly planning and introduce a class of optimal Graph Exploration Assembly Planners (GEAPs). For larger structures, we show how Reinforcement Learning (RL) enables us to learn policies that generate high-reward assembly sequences. We evaluate our approach on a variety of robotic assembly problems, such as the assembly of the Hubble Space Telescope, the International Space Station, and the James Webb Space Telescope. We further showcase how our DP, GEAP, and RL implementations are capable of finding optimal solutions under a variety of different objective functions and how our formulation allows us to translate precedence constraints into branch pruning and thus further improve performance. We have published our code at https://github.com/labicon/ORASP-Code.
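To make the shortest-path view concrete, the following is a minimal sketch and not the authors' implementation. It assumes a state is the set of parts already placed (encoded as a bitmask), that every step cost is non-negative, and it swaps in plain Dijkstra search as a generic shortest-path solver. The names `plan_assembly`, `step_cost`, and `precedence` are hypothetical and chosen only for illustration. Precedence constraints are handled by pruning: a part whose prerequisites are not yet placed is never expanded.

```python
import heapq
from typing import Callable, Dict, List, Optional, Set, Tuple


def plan_assembly(
    n_parts: int,
    step_cost: Callable[[int, int], float],
    precedence: Optional[Dict[int, Set[int]]] = None,
) -> Tuple[float, List[int]]:
    """Return (total cost, part order) of a cheapest assembly sequence.

    Illustrative assumptions: a state is a bitmask of placed parts,
    step_cost(state, part) >= 0 is the cost of adding `part` in `state`,
    and precedence[part] is the set of parts that must be placed first.
    """
    precedence = precedence or {}
    goal = (1 << n_parts) - 1
    best: Dict[int, float] = {0: 0.0}           # cheapest known cost per state
    parent: Dict[int, Tuple[int, int]] = {}     # state -> (previous state, part)
    frontier: List[Tuple[float, int]] = [(0.0, 0)]

    while frontier:
        cost, state = heapq.heappop(frontier)
        if state == goal:
            break
        if cost > best[state]:
            continue  # stale queue entry
        for part in range(n_parts):
            if state & (1 << part):
                continue  # part already placed
            # Branch pruning: skip parts whose prerequisites are missing.
            if any(not state & (1 << p) for p in precedence.get(part, ())):
                continue
            nxt = state | (1 << part)
            new_cost = cost + step_cost(state, part)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = (state, part)
                heapq.heappush(frontier, (new_cost, nxt))

    if goal not in best:
        raise ValueError("no feasible assembly sequence")

    # Walk parent pointers back from the goal to recover the sequence.
    sequence: List[int] = []
    state = goal
    while state != 0:
        state, part = parent[state]
        sequence.append(part)
    sequence.reverse()
    return best[goal], sequence


# Toy objective (an assumption): placing a part costs its weight times
# one plus the number of parts already placed.
weights = [3, 1, 2]


def toy_cost(state: int, part: int) -> float:
    return weights[part] * (1 + bin(state).count("1"))
```

Because states are generated only when they are popped from the queue, the graph is explored on demand rather than enumerated up front. For example, `plan_assembly(3, toy_cost, {0: {1}})` requires part 1 to be placed before part 0, and any sequence violating that ordering is never generated.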
