Optimal Robotic Assembly Sequence Planning: A Sequential Decision-Making Approach

Kartik Nagpal, Negar Mehr

TL;DR

This work reframes robotic assembly sequencing as an optimal control problem and reduces it to a shortest-path problem on a consolidated state-action graph to enable scalable planning. It introduces Graph Exploration Assembly Planners (GEAPs), the ORASP on-demand search, and a Deep Q Network (DQN) approach to handle large, nonlinear objective functions and constraints such as sequential precedence and unconnected parts. The framework achieves substantial speedups over state-of-the-art ILP in moderate settings and demonstrates near-optimal or superior performance on very large structures, with publicly available code. Together, these methods advance scalable, constraint-aware planning for multi-robot assembly with broad practical implications for industrial and space applications.

Abstract

The optimal robot assembly planning problem is challenging because the optimal solution must be found among an exponentially large number of possible plans while satisfying a set of constraints. Traditionally, robotic assembly planning problems have been solved using heuristics, but these methods are specific to a given objective structure or set of problem parameters. In this paper, we propose a novel approach to robotic assembly planning that poses assembly sequencing as a sequential decision-making problem, enabling us to harness methods that far outperform the state-of-the-art. We formulate the problem as a Markov Decision Process (MDP) and utilize Dynamic Programming (DP) to find optimal assembly policies for moderately sized structures. We further expand our framework to exploit the deterministic nature of assembly planning and introduce a class of optimal Graph Exploration Assembly Planners (GEAPs). For larger structures, we show how Reinforcement Learning (RL) enables us to learn policies that generate high-reward assembly sequences. We evaluate our approach on a variety of robotic assembly problems, such as the assembly of the Hubble Space Telescope, the International Space Station, and the James Webb Space Telescope. We further showcase how our DP, GEAP, and RL implementations are capable of finding optimal solutions under a variety of different objective functions, and how our formulation allows us to translate precedence constraints into branch pruning and thus further improve performance. We have published our code at https://github.com/labicon/ORASP-Code.
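To make the sequential decision-making view concrete, the following is a minimal sketch, not the implementation in the published repository. It assumes that a state is the set of connections still present in the structure, that an action removes one connection, and that the assembly order is the reverse of the resulting removal order. The four-part ring, the `step_cost` function, and all names in the block are hypothetical choices made only for illustration.

```python
from functools import lru_cache
from itertools import permutations

# Hypothetical four-part structure: parts 0..3 joined in a ring.
# Each tuple is one connection between two parts.
EDGES = ((0, 1), (1, 2), (2, 3), (3, 0))


def step_cost(remaining, edge):
    """Illustrative state-dependent cost (an assumption, not the paper's objective).

    Removing a connection costs more when many other remaining
    connections touch the same two parts.
    """
    a, b = edge
    neighbors = sum(1 for e in remaining if e != edge and (a in e or b in e))
    return 1.0 + neighbors ** 2


@lru_cache(maxsize=None)
def cost_to_go(remaining):
    """Dynamic programming over states: minimal cost to remove all connections.

    `remaining` is a frozenset of connections. Memoization means every
    distinct state is solved once, no matter how many removal orders reach it.
    Returns (optimal cost, optimal removal sequence).
    """
    if not remaining:
        return 0.0, ()
    best = None
    for edge in sorted(remaining):
        tail_cost, tail_seq = cost_to_go(remaining - {edge})
        total = step_cost(remaining, edge) + tail_cost
        if best is None or total < best[0]:
            best = (total, (edge,) + tail_seq)
    return best


def sequence_cost(sequence):
    """Cost of one fixed removal order, used for a brute-force comparison."""
    remaining = frozenset(sequence)
    total = 0.0
    for edge in sequence:
        total += step_cost(remaining, edge)
        remaining = remaining - {edge}
    return total


if __name__ == "__main__":
    cost, removal_order = cost_to_go(frozenset(EDGES))
    brute_force = min(sequence_cost(p) for p in permutations(EDGES))
    print("DP cost:", cost, "brute-force cost:", brute_force)
    print("assembly order (reverse of removal):", removal_order[::-1])
```

Under these assumptions, the memoized recursion solves at most one subproblem per subset of connections (16 for four connections), whereas brute force enumerates all 24 orderings. The gap widens quickly as structures grow, which is the motivation for treating the problem as a search over states rather than over complete sequences.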

Paper Structure (10 sections, 3 equations, 10 figures, 1 table, 1 algorithm)


Figures (10)

  • Figure 1: Our "4Brick" structure in (\ref{fig:Ex}) is translated to its graph representation $\mathcal{G}$ in (\ref{fig:G}). We use this toy example repeatedly throughout the paper.
  • Figure 2: The original structures, their corresponding graph representations, and a comparison of the rewards obtained by our DQN approach against the reward distribution of 100 randomly sampled sequences. Our proposed DQN structure recovers the optimal solution for the Hubble, ISS, and Furniture scenarios. With its 14 quinvigintillion potential subassemblies, the JWST model is too large to verify for optimality, but our DQN still outperforms the entire distribution of sampled sequences. Note: Each label is followed by the tuple $(N,E)$, where $N$ is the number of parts in the structure and $E$ is the number of connections.
  • Figure 3: Illustration of how we exploit the isomorphic nature of our state definition to "consolidate" our state-action space for the "4Brick" scenario. Note the color coordination between arrow colors and the connections removed at each step, which is in line with Fig. \ref{fig:ExampleConversion}(\ref{fig:G}).
  • Figure 4: The rate at which the number of states grows with the number of connections in the fully assembled structure. This is equivalent to the growth rate of the number of nodes in the consolidated graph $\mathcal{H}$. Note that the y-axes of these plots are on a logarithmic scale.
  • Figure 5: Graph representations for the structures discussed in Table \ref{table:res}. Each label is followed by the tuple $(N,E)$, where $N$ is the number of parts in the structure and $E$ is the number of connections in the structure.
  • ...and 5 more figures