MatchMaker: Automated Asset Generation for Robotic Assembly

Yian Wang, Bingjie Tang, Chuang Gan, Dieter Fox, Kaichun Mo, Yashraj Narang, Iretiayo Akinola

TL;DR

MatchMaker addresses the data bottleneck in robotic assembly by automatically generating diverse, simulatable paired assets for single-axis tasks. It combines a vision-language model to identify assembly axes, a diffusion-based shape completion process that preserves contact surfaces, and a clearance specification step to ensure penetration-free interactions. The approach yields richer asset diversity and enables effective policy learning in simulation, with successful transfer to real-world 3D-printed parts demonstrating practical impact. This scalable asset-generation framework paves the way for broader generalization of assembly skills across varied geometries and tasks.

Abstract

Robotic assembly remains a significant challenge due to complexities in visual perception, functional grasping, contact-rich manipulation, and performing high-precision tasks. Simulation-based learning and sim-to-real transfer have led to recent success in solving assembly tasks in the presence of object pose variation, perception noise, and control error; however, the development of a generalist (i.e., multi-task) agent for a broad range of assembly tasks has been limited by the need to manually curate assembly assets, which greatly constrains the number and diversity of assembly problems that can be used for policy learning. Inspired by recent success of using generative AI to scale up robot learning, we propose MatchMaker, a pipeline to automatically generate diverse, simulation-compatible assembly asset pairs to facilitate learning assembly skills. Specifically, MatchMaker can 1) take a simulation-incompatible, interpenetrating asset pair as input, and automatically convert it into a simulation-compatible, interpenetration-free pair, 2) take an arbitrary single asset as input, and generate a geometrically-mating asset to create an asset pair, 3) automatically erode contact surfaces from (1) or (2) according to a user-specified clearance parameter to generate realistic parts. We demonstrate that data generated by MatchMaker outperforms previous work in terms of diversity and effectiveness for downstream assembly skill learning. For videos and additional details, please see our project website: https://wangyian-me.github.io/MatchMaker/.
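As a rough illustration of step (3), the sketch below shrinks a part inward by a user-specified clearance. It is not MatchMaker's implementation: the abstract does not say how erosion is performed. The sketch assumes the part is described by a 2D signed distance function (SDF) and erodes it by offsetting that function. The names `circle_sdf`, `erode`, and `is_inside`, and the numeric values, are hypothetical.

```python
import math
from typing import Callable

# Signed distance function (SDF): negative inside the part, positive outside.
# Assumption: a 2D cross-section stands in for the full 3D contact surface.
SDF = Callable[[float, float], float]


def circle_sdf(radius: float) -> SDF:
    """SDF of a circular peg cross-section centred at the origin."""
    return lambda x, y: math.hypot(x, y) - radius


def erode(sdf: SDF, clearance: float) -> SDF:
    """Shrink a shape inward by `clearance` by adding it to the SDF."""
    if clearance < 0:
        raise ValueError("clearance must be non-negative")
    return lambda x, y: sdf(x, y) + clearance


def is_inside(sdf: SDF, x: float, y: float) -> bool:
    """True if the point (x, y) lies in or on the shape."""
    return sdf(x, y) <= 0.0


# Hypothetical values in millimetres: a 4 mm radius peg, 0.5 mm clearance.
peg = circle_sdf(4.0)
eroded_peg = erode(peg, 0.5)

# A point 3.8 mm from the axis is inside the original peg but falls in the
# clearance gap after erosion; a point at 3.4 mm remains inside.
print(is_inside(peg, 3.8, 0.0), is_inside(eroded_peg, 3.8, 0.0))
print(is_inside(eroded_peg, 3.4, 0.0))
```

Offsetting an SDF gives a uniform inward offset for simple convex shapes like this one. A mesh-based pipeline would need a different mechanism, such as offsetting vertices along their normals, which is not shown here.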

Paper Structure

This paper contains 18 sections, 5 figures, 3 tables.

Figures (5)

  • Figure 2: Contact-Surface Extraction: (a) A Vision-Language Model (VLM), such as GPT-4o, is employed to identify the object and predict potential insertion directions. (b) An analytical method is then applied to extract the contact surfaces based on the object's geometry and results from (a).
  • Figure 3: Samples of generated results. The green parts were sampled from the ABC dataset (Koch et al., 2019), while the blue parts were generated. Asset pairs are labeled with unique identifiers (UIDs) for easy reference.
  • Figure 4: Policy-learning results for automatically post-processed assets (from MatchMaker) and manually post-processed assets (from AutoMate; Tang et al., 2024). For each, we select the best policy checkpoint and evaluate it across 3,000 trials.
  • Figure 5: Policy-learning results of generated assets.
  • Figure 6: Key frames of the assembly process for each 3D-printed asset pair, with success rates over 10 trials.