Learning to Build: Autonomous Robotic Assembly of Stable Structures Without Predefined Plans

Jingwen Wang, Johannes Kirschner, Paul Rolland, Luis Salamanca, Stefana Parascho

TL;DR

A novel autonomous robotic assembly framework for constructing stable structures without relying on predefined architectural blueprints, where construction tasks are defined through targets and obstacles, allowing the system to adapt more flexibly to environmental uncertainty and variations during the building process.

Abstract

This paper presents a novel autonomous robotic assembly framework for constructing stable structures without relying on predefined architectural blueprints. Instead of following fixed plans, construction tasks are defined through targets and obstacles, allowing the system to adapt more flexibly to environmental uncertainty and variations during the building process. A reinforcement learning (RL) policy, trained using deep Q-learning with successor features, serves as the decision-making component. As a proof of concept, we evaluate the approach on a benchmark of 15 two-dimensional (2D) robotic assembly tasks involving discrete block construction. Experiments using a real-world closed-loop robotic setup demonstrate the feasibility of the method and its ability to handle construction noise. The results suggest that our framework offers a promising direction for more adaptable and robust robotic construction in real-world environments.

Paper Structure (27 sections, 7 figures, 1 table, 1 algorithm)

This paper contains 27 sections, 7 figures, 1 table, 1 algorithm.

Figures (7)

  • Figure 1: Left: Problem Definition: Example of a construction task, defined by the construction space, target locations, obstacle regions, and available unit blocks. Right: Tasks: A set of 15 tasks used to evaluate the reinforcement learning approach.
  • Figure 2: Example episode run for Task 8.
  • Figure 3: Closed-loop robotic assembly workflow. Given the current constructed structure (state $S$) and the task specification, the trained policy selects the next action $A$ for the robot to execute. After the robot places the block in the physical world, ArUco-based pose estimation is used to detect the updated block configuration. This information is then fed back into the simulation to update the state $S$ for the next decision step.
  • Figure 4: Closed-loop robotic assembly setup. Left: Real-world construction scene with the robotic arm. Top right: 3D-printed blocks labeled with ArUco markers for visual tracking. Bottom right: Custom L-shaped suction gripper used for block manipulation.
  • Figure 5: The plots show the total number of solved tasks, the average cumulative reward, and the average number of blocks per episode across the 15 tasks. After only 10 episodes, the policy solves 13 of the 15 tasks and achieves an average reward of $\sim 0.75$. In episode 37, the policy solves all 15 tasks. In the final episode, the policy solves 14 of the 15 tasks and achieves a reward $>0.75$. The right plot shows that, as training progresses, the policy learns to build structures that reach the task targets using fewer blocks.
  • ...and 2 more figures
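
The closed-loop workflow described in the Figure 3 caption (observe state $S$, select action $A$, place the block, re-detect, update $S$) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `State`, `run_closed_loop`, `policy`, `execute`, `detect`, and `is_done` are hypothetical, the block pose is reduced to an assumed `(x, y)` pair, and the trained policy, the robot, and the ArUco-based detection are replaced by simple stand-in functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical block pose: (x, y) centre of a block in the 2D construction plane.
Block = Tuple[float, float]


@dataclass
class State:
    """Current structure S, stored as the list of detected block poses."""
    blocks: List[Block] = field(default_factory=list)


def run_closed_loop(
    policy: Callable[[State], Block],
    execute: Callable[[Block], None],
    detect: Callable[[], List[Block]],
    is_done: Callable[[State], bool],
    max_steps: int = 20,
) -> State:
    """Repeat: choose action A from S, place the block, re-observe, update S."""
    state = State(blocks=detect())
    for _ in range(max_steps):
        if is_done(state):
            break
        action = policy(state)          # policy selects the next placement A
        execute(action)                 # robot places the block (may be imprecise)
        state = State(blocks=detect())  # S is rebuilt from what was observed
    return state


def demo() -> State:
    """Toy run with stand-ins for the policy, the robot, and the pose detector."""
    world: List[Block] = []  # assumed ground truth of the physical scene
    offset = 0.02            # assumed constant placement error along x

    def policy(state: State) -> Block:
        # Stand-in policy: stack on top of the highest observed block.
        if not state.blocks:
            return (0.0, 0.0)
        x, y = max(state.blocks, key=lambda b: b[1])
        return (x, y + 1.0)

    def execute(action: Block) -> None:
        # Stand-in robot: places the block with a small error in x.
        world.append((action[0] + offset, action[1]))

    def detect() -> List[Block]:
        # Stand-in for marker-based pose estimation: reports the true poses.
        return list(world)

    def is_done(state: State) -> bool:
        # Stand-in target: some block reaches height y >= 2.
        return any(y >= 2.0 for _, y in state.blocks)

    return run_closed_loop(policy, execute, detect, is_done)
```

Calling `demo()` places three blocks. Because each decision is made from the observed pose rather than the planned one, the stand-in policy stacks each new block on the block's actual position, which is the kind of adaptation to construction noise that the closed loop is meant to enable.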