Jigsaw++: Imagining Complete Shape Priors for Object Reassembly

Jiaxin Lu, Gang Hua, Qixing Huang

TL;DR

Jigsaw++ tackles object reassembly by imagining complete shape priors from partial inputs. It combines a diffusion-based, category-agnostic 3D shape generator with a retargeting mechanism that aligns incomplete assemblies to the learned shape space, using rectified flow for efficient generation. A bidirectional image-to-3D mapping via LEAP makes it possible to leverage large-scale 2D data to produce point-cloud priors, with a category-robust encoder guiding joint latent generation. Experimental results on Breaking Bad and PartNet show consistent improvements over baselines in reconstruction quality and robustness to missing pieces, illustrating the practical value of incorporating complete-shape priors as orthogonal guidance for reassembly. The work opens avenues for integrating priors into existing assembly pipelines and for scaling to broader object categories and topologies.

Abstract

The automatic assembly problem has attracted increasing interest due to its complex challenges involving 3D representation. This paper introduces Jigsaw++, a novel generative method designed to tackle the multifaceted challenges of reconstructing a complete shape for the reassembly problem. Existing approaches focus primarily on piecewise information for both part and fracture assembly, often overlooking the integration of a complete-object prior. Jigsaw++ distinguishes itself by learning a shape prior of complete objects. It employs a proposed "retargeting" strategy that leverages the output of any existing assembly method to generate complete shape reconstructions. This capability allows it to function orthogonally to current methods. Through extensive evaluations on the Breaking Bad dataset and PartNet, Jigsaw++ demonstrates its effectiveness, reducing reconstruction errors and enhancing the precision of shape reconstruction, which sets a new direction for future reassembly model development.

Paper Structure

This paper contains 46 sections, 6 equations, 14 figures, 3 tables.

Figures (14)

  • Figure 1: Overview of the problem setting. The input consists of a partially assembled object represented as a point cloud. The task requires the method to reconstruct a complete object from this input. We identify several representative challenges: (a) When the object is nearly fully assembled, the output should maintain the overall shape. (b) Although all parts are visible and present, their positions are misaligned. The algorithm needs to adjust their positions correctly. (c, d) In cases where parts are incomplete or significantly misplaced, the method should not only complete the object but also correct the displacements.
  • Figure 2: Potential solutions fail to provide a shape prior when given a partially assembled object. These include the point cloud completion method AdaPoinTr [yu2021pointr], reconstruction with the LION [Zeng2022LIONLP] VAE, the editing method SDEdit [meng2022sdedit], and direct conditional generation or inversion-then-generate without finetuning. A full comparison is presented in Appendix A.
  • Figure 3: Generation with image-to-3D. The point cloud (or mesh, if present) is first rendered under specific camera parameters by mapping positions to RGB space. The image-to-3D reconstruction model then encodes these rendered images into both a reconstruction latent $r$ (shown here in decoded form) and a global latent $g$. A rectified flow model is trained to jointly generate these latents. Subsequently, the generated latents are decoded, rendered, and mapped back to a point cloud.
  • Figure 4: Reconstruction and retargeting. Reconstruction uses a reverse sampling stage to convert the input into a latent, which is then perturbed to generate a complete shape. Retargeting provides guidance for latents that have low likelihood under $\mathcal{N}(0,I)$.
  • Figure 5: Ablation study of Jigsaw++ with varying parameters on the Breaking Bad dataset. Top: varying the number of reverse sampling steps, $N_r=kN$, to assess how well the rectified flow model accommodates step reductions. Bottom: varying the $\alpha$ parameter in the Langevin dynamics to explore how changes in latent resampling during the retargeting phase affect model performance.
  • ...and 9 more figures
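
The captions for Figures 3 to 5 mention two mechanisms that a small sketch may help make concrete: mapping point positions to RGB values and back, and reverse sampling with a rectified flow using $N_r=kN$ steps. The code below is a minimal illustration and not the authors' implementation. The per-axis bounding-box normalization, the fixed-step Euler integrator, the time convention (noise at $t=0$, data at $t=1$), and the function names `positions_to_rgb`, `rgb_to_positions`, `euler_flow`, and `toy_velocity` are all assumptions introduced here; `toy_velocity` is a hypothetical stand-in for a trained velocity network, and the camera rendering step is omitted.

```python
import numpy as np


def positions_to_rgb(points):
    """Map xyz positions to colors in [0, 1].

    Assumption: per-axis bounding-box normalization. The caption only says
    positions are mapped to RGB space, not how.
    """
    lo = points.min(axis=0)
    hi = points.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)  # guard against flat axes
    return (points - lo) / scale, (lo, scale)


def rgb_to_positions(colors, bounds):
    """Invert positions_to_rgb using the stored bounding box."""
    lo, scale = bounds
    return colors * scale + lo


def euler_flow(z, velocity, t_start, t_end, num_steps):
    """Integrate dz/dt = velocity(z, t) from t_start to t_end with Euler steps."""
    dt = (t_end - t_start) / num_steps
    t = t_start
    for _ in range(num_steps):
        z = z + dt * velocity(z, t)
        t += dt
    return z


def toy_velocity(z, t):
    """Hypothetical smooth velocity field standing in for a trained model."""
    return 1.0 - 0.5 * z


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Position <-> RGB round trip on a random point cloud.
    points = rng.uniform(-2.0, 3.0, size=(1024, 3))
    colors, bounds = positions_to_rgb(points)
    recovered = rgb_to_positions(colors, bounds)
    print("max position error:", np.abs(recovered - points).max())

    # Reverse sampling (data -> latent) with N_r = k * N steps, then
    # forward generation (latent -> data) with N steps.
    N, k = 100, 0.5
    N_r = int(k * N)
    x = rng.uniform(-1.0, 1.0, size=(4, 3))
    latent = euler_flow(x, toy_velocity, 1.0, 0.0, N_r)
    x_back = euler_flow(latent, toy_velocity, 0.0, 1.0, N)
    print("max round-trip error:", np.abs(x_back - x).max())
```

In this toy setting the round trip through the flow is only approximate, because Euler steps introduce discretization error that grows as the step count shrinks. That is the kind of trade-off the step-reduction ablation in Figure 5 appears to probe, although the actual behavior depends on the trained model.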