BoxSplitGen: A Generative Model for 3D Part Bounding Boxes in Varying Granularity

Juil Koo, Wei-Tung Lin, Chanho Park, Chanhyeok Park, Minhyuk Sung

TL;DR

A framework for intuitive, interactive 3D shape generation that refines a set of part bounding boxes by iteratively splitting them. The box-splitting generative model outperforms token prediction models and an inpainting approach based on an unconditional diffusion model.

Abstract

Human creativity follows a perceptual process, moving from abstract ideas to finer details during creation. While 3D generative models have advanced dramatically, models specifically designed to assist human imagination in 3D creation -- particularly for detailing abstractions from coarse to fine -- have not been explored. We propose a framework that enables intuitive and interactive 3D shape generation by iteratively splitting bounding boxes to refine the set of bounding boxes. The main technical components of our framework are two generative models: the box-splitting generative model and the box-to-shape generative model. The first model, named BoxSplitGen, generates a collection of 3D part bounding boxes with varying granularity by iteratively splitting coarse bounding boxes. It utilizes part bounding boxes created through agglomerative merging and learns the reverse of the merging process -- the splitting sequences. The model consists of two main components: the first learns the categorical distribution of the box to be split, and the second learns the distribution of the two new boxes, given the set of boxes and the indication of which box to split. The second model, the box-to-shape generative model, is trained by leveraging the 3D shape priors learned by an existing 3D diffusion model while adapting the model to incorporate bounding box conditioning. In our experiments, we demonstrate that the box-splitting generative model outperforms token prediction models and the inpainting approach with an unconditional diffusion model. Also, we show that our box-to-shape model, based on a state-of-the-art 3D diffusion model, provides superior results compared to a previous model.
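
The two-stage splitting step described in the abstract (choose which box to split, then sample its two children) can be sketched as a simple loop. This is only an illustration of the control flow, not the authors' implementation: `pick_pivot` and `sample_children` are hypothetical stand-ins for the two learned components. Here the pivot is drawn with probability proportional to box volume, and the children come from one axis-aligned cut along the pivot's longest axis. The trained model learns both distributions from data instead, and a learned sampler may not tile the parent box exactly as this stand-in does.

```python
import random

# A box is (min_corner, max_corner), each a 3-tuple of floats.

def volume(box):
    lo, hi = box
    return (hi[0] - lo[0]) * (hi[1] - lo[1]) * (hi[2] - lo[2])

def pick_pivot(boxes, rng):
    """Hypothetical stand-in for the learned categorical distribution
    over which box to split: sample an index proportional to volume."""
    weights = [volume(b) for b in boxes]
    return rng.choices(range(len(boxes)), weights=weights, k=1)[0]

def sample_children(boxes, pivot, rng):
    """Hypothetical stand-in for the learned child-box distribution:
    cut the pivot box once along its longest axis."""
    lo, hi = boxes[pivot]
    axis = max(range(3), key=lambda a: hi[a] - lo[a])
    cut = lo[axis] + rng.uniform(0.3, 0.7) * (hi[axis] - lo[axis])
    first_hi = list(hi)
    first_hi[axis] = cut
    second_lo = list(lo)
    second_lo[axis] = cut
    return (lo, tuple(first_hi)), (tuple(second_lo), hi)

def split_boxes(num_splits, seed=0):
    """Start from a unit cube and apply `num_splits` split steps."""
    rng = random.Random(seed)
    boxes = [((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))]
    for _ in range(num_splits):
        pivot = pick_pivot(boxes, rng)
        child_a, child_b = sample_children(boxes, pivot, rng)
        # Replace the pivot box with its two children.
        boxes[pivot:pivot + 1] = [child_a, child_b]
    return boxes
```

Each split step adds exactly one box, so `split_boxes(5)` returns six boxes. Stopping the loop earlier or later is what gives the varying granularity.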

Paper Structure (43 sections, 3 equations, 14 figures, 7 tables)

Figures (14)

  • Figure 1: An overview of our box-splitting-based 3D shape generative framework. The left shows our iterative box splitting and box-to-shape generation, where diverse shapes at the top of the tree become increasingly specific deeper in the tree. The right showcases our user-interactive box and shape editing demo.
  • Figure 2: Overview of our hierarchical bounding box splitting framework. On the left is a binary tree for 3D shape abstraction, where red-highlighted nodes $b_v$ are split into finer child nodes, with blue and green backgrounds showing split steps at $s$ and $s+1$. On the right, the framework performs pivot classification, samples two red-highlighted child boxes, and generates 3D shapes using our Box2Shape model.
  • Figure 3: Diagram of network architectures. (a) Child-Boxes Diffusion. (b) Box2Shape. Starting from a unit cube, we iteratively split boxes using Child-Boxes Diffusion to obtain the box condition with desired granularity, which then guides Box2Shape to generate aligned 3D shapes.
  • Figure 4: Qualitative comparison of shape abstraction generation. For each pair of columns, we query with the ground-truth shape and retrieve the closest generated boxes, as measured by Chamfer distance. Our method produces higher-fidelity boxes.
  • Figure 5: Gallery of our generated bounding boxes and their final generated 3D shapes by box-conditioned shape generation models. Each pair of columns shows the input bounding boxes (left) and their corresponding generated 3D shapes (right).
  • ...and 9 more figures
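
The retrieval step mentioned in the Figure 4 caption (finding the generated sample closest to a query under Chamfer distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's evaluation code. It assumes shapes and boxes have already been converted to point sets, which the caption does not describe, and it uses the common symmetric form of Chamfer distance with squared Euclidean distances averaged in each direction. The function names `chamfer_distance` and `retrieve_closest` are hypothetical.

```python
def chamfer_distance(p, q):
    """Symmetric Chamfer distance between two non-empty point sets,
    each a list of (x, y, z) tuples. Assumed variant: mean squared
    nearest-neighbor distance, summed over both directions."""
    def squared_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def one_way(src, dst):
        return sum(min(squared_dist(a, b) for b in dst) for a in src) / len(src)

    return one_way(p, q) + one_way(q, p)

def retrieve_closest(query, candidates):
    """Return the index of the candidate point set nearest to `query`."""
    distances = [chamfer_distance(query, c) for c in candidates]
    return min(range(len(candidates)), key=lambda i: distances[i])
```

The brute-force nearest-neighbor search is quadratic in the number of points, which is acceptable for a small sketch. A spatial index would be the usual choice for large point sets.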