DECOLLAGE: 3D Detailization by Controllable, Localized, and Learned Geometry Enhancement

Qimin Chen, Zhiqin Chen, Vladimir G. Kim, Noam Aigerman, Hao Zhang, Siddhartha Chaudhuri

TL;DR

DECOLLAGE tackles the challenge of adding high-quality geometric details to coarse 3D shapes in a controllable, region-specific manner. It introduces a Pyramid GAN that conditions multi-scale upsampling on per-voxel style codes and part labels, guided by global and style-specific discriminators, and reinforced by structure-preserving and reconstruction losses, plus adaptive $\alpha$ weighting at style boundaries. The approach enables novel interactive workflows and robust cross-category style mixing, outperforming prior global-detailization methods in connectivity, structural fidelity, and local detail plausibility. Training relies on data augmentation from a small set of segmented style shapes, and applications demonstrate interactive detailization across diverse inputs with fast inference. The work advances interactive 3D content creation by enabling localized, exemplar-driven geometry enhancement with coherent transitions across regions and scales, while outlining future directions in diffusion-based voxel upsampling and integration with multimodal models.

Abstract

We present a 3D modeling method that enables end-users to refine or detailize 3D shapes using machine learning, expanding the capabilities of AI-assisted 3D content creation. Given a coarse voxel shape (e.g., one produced with a simple box extrusion tool or via generative modeling), a user can directly "paint" desired target styles representing compelling geometric details, from input exemplar shapes, over different regions of the coarse shape. These regions are then up-sampled into high-resolution geometries that adhere to the painted styles. To achieve such controllable and localized 3D detailization, we build on top of a Pyramid GAN by making it masking-aware. We devise novel structural losses and priors to ensure that our method preserves both desired coarse structures and fine-grained features even if the painted styles are borrowed from diverse sources, e.g., different semantic parts and even different shape categories. Through extensive experiments, we show that our ability to localize details enables novel interactive creative workflows and applications. Our experiments further demonstrate that, in comparison to prior techniques built on global detailization, our method generates structure-preserving, high-resolution stylized geometries with more coherent shape details and style transitions.
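
The "painting" step described above can be pictured as assigning a style index to each occupied coarse voxel and then looking up a style code per voxel. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `paint_styles`, the use of `-1` for unpainted voxels, and the two-dimensional style codes are all hypothetical and chosen only for illustration.

```python
import numpy as np

def paint_styles(coarse, regions, style_codes):
    """Build a per-voxel style-code volume from painted regions.

    coarse:      (D, H, W) bool occupancy grid of the coarse shape.
    regions:     (D, H, W) int grid; each entry is the index of the painted
                 style, or -1 where nothing was painted (an assumed convention).
    style_codes: (S, C) float array, one hypothetical code per exemplar.
    Returns (volume, mask): a (D, H, W, C) array that is zero outside the
    painted, occupied voxels, and the (D, H, W) bool mask of those voxels.
    """
    coarse = np.asarray(coarse, dtype=bool)
    regions = np.asarray(regions)
    style_codes = np.asarray(style_codes, dtype=np.float32)
    # Only voxels that are both occupied and painted receive a style code.
    mask = coarse & (regions >= 0)
    volume = np.zeros(coarse.shape + (style_codes.shape[1],), dtype=np.float32)
    volume[mask] = style_codes[regions[mask]]
    return volume, mask

# Toy example: a 4x4x4 grid with a 2x2x4 occupied bar, painted with two styles.
coarse = np.zeros((4, 4, 4), dtype=bool)
coarse[1:3, 1:3, :] = True
regions = np.full((4, 4, 4), -1)
regions[..., :2] = 0   # first half of the bar uses style 0
regions[..., 2:] = 1   # second half uses style 1
codes = np.array([[1.0, 0.0], [0.0, 1.0]])
volume, mask = paint_styles(coarse, regions, codes)
```

In this sketch the mask plays the role of the region information that a masking-aware generator would consume alongside the style codes; how the actual network uses it is not specified in this section.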

Paper Structure

This paper contains 20 sections, 8 equations, 20 figures, and 6 tables.

Figures (20)

  • Figure 1: Décollage is an art form created by "cutting/removing pieces of an original image". When "painting" a style exemplar with geometric details over a region of a coarse shape, coarse surfaces are removed to unveil a detailized version to mimic the exemplar. We show an out-of-distribution chair-like shape detailized via style mixing, where five exemplars "décollaged" the coarse voxels.
  • Figure 2: DECOR-GAN$^{*}$ [chen2021decor] (a) with naïve local controllability generates disconnected structures and floating pieces. Our method DECOLLAGE (b) fares much better in preserving global structure and generating local geometric details.
  • Figure 3: Application: detailizing shapes from various sources, including (a) extruded 2D profiles; (b) coarse voxels created via an interactive user interface (see supplementary); (c) shapes generated by a text-to-3D model; (d) simple CAD primitives.
  • Figure 4: Network architecture. Conditioned on a set of style codes associated with each segmented part, the network upsamples the coarse content voxels with part labels into detailed geometries at multiple resolutions. For each upsampling level $j$, the discriminator enforces the local patches of each part in the upsampled geometry to be plausible with respect to the styles they are conditioned on. The structure-preserving losses $\mathcal{L}_{down}^{j}$ and $\mathcal{L}_{up}^{j}$ enforce the structure of the output to be consistent with the input.
  • Figure 5: Augmentation examples of different categories. For each category, we show the original style shape in the first row, the corresponding augmented style shapes on the left of the second row, and their downsampled coarse versions used for training on the right of the second row.
  • ...and 15 more figures
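
The captions for Figures 4 and 5 mention downsampling detailed shapes into coarse training inputs and keeping the output structure consistent with the input. The sketch below shows one plausible reading of that idea on binary occupancy grids. It is an assumption-laden illustration, not the paper's definition of $\mathcal{L}_{down}^{j}$ or $\mathcal{L}_{up}^{j}$: max pooling as the downsampling operator, the function names, and the mismatch count are all hypothetical.

```python
import numpy as np

def downsample_occupancy(fine, factor):
    """Max-pool a (D, H, W) bool grid by `factor` along each axis.

    A coarse voxel is occupied if any fine voxel inside its block is occupied
    (an assumed downsampling rule, chosen for illustration).
    """
    fine = np.asarray(fine, dtype=bool)
    d, h, w = fine.shape
    if d % factor or h % factor or w % factor:
        raise ValueError("grid size must be divisible by factor")
    blocks = fine.reshape(d // factor, factor, h // factor, factor, w // factor, factor)
    return blocks.any(axis=(1, 3, 5))

def structure_mismatch(coarse, fine, factor):
    """Count coarse voxels where the downsampled output disagrees with the input."""
    coarse = np.asarray(coarse, dtype=bool)
    return int(np.sum(downsample_occupancy(fine, factor) != coarse))

# Toy example: an 8x8x8 detailed grid and its 2x2x2 coarse counterpart.
fine = np.zeros((8, 8, 8), dtype=bool)
fine[1, 2, 3] = True
coarse = downsample_occupancy(fine, 4)

# A stray voxel far from the input structure shows up as a mismatch.
floating = fine.copy()
floating[7, 7, 7] = True
clean_error = structure_mismatch(coarse, fine, 4)
floating_error = structure_mismatch(coarse, floating, 4)
```

A check of this kind penalizes floating pieces that fall outside the coarse input, which is the failure mode Figure 2 attributes to naïve local control; a differentiable training loss would need a soft version of the same comparison.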