GenVDM: Generating Vector Displacement Maps From a Single Image

Yuezhi Yang, Qimin Chen, Vladimir G. Kim, Siddhartha Chaudhuri, Qixing Huang, Zhiqin Chen

TL;DR

GenVDM introduces a novel pipeline to generate Vector Displacement Maps (VDMs) from a single image, addressing the need for controllable geometric detail in 3D modeling. It first produces multi-view normal maps using a fine-tuned diffusion-based model, then reconstructs a 3D shape via a neural SDF followed by parameterization to a VDM through a neural deformation field on a square domain. To train and evaluate, the authors build a 1,200-patch VDM dataset from Objaverse and demonstrate that their method outperforms baselines on perceptual and semantic metrics, while enabling practical applications in shape modeling and part editing. This work enables artists to generate, customize, and attach detailed geometric stamps to existing meshes, bridging 2D image editing with 3D surface detail. The dataset and pipeline pave the way for data-efficient VDM generation and broader adoption of part-based geometric detailing in creative workflows.

Abstract

We introduce the first method for generating Vector Displacement Maps (VDMs): parameterized, detailed geometric stamps commonly used in 3D modeling. Given a single input image, our method first generates multi-view normal maps and then reconstructs a VDM from the normals via a novel reconstruction pipeline. We also propose an efficient algorithm for extracting VDMs from 3D objects, and present the first academic VDM dataset. Compared to existing 3D generative models focusing on complete shapes, we focus on generating parts that can be seamlessly attached to shape surfaces. The method gives artists rich control over adding geometric details to a 3D shape. Experiments demonstrate that our approach outperforms existing baselines. Generating VDMs offers additional benefits, such as using 2D image editing to customize and refine 3D details.
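As a rough illustration of what a VDM stores, the sketch below treats a VDM as an image whose pixels each hold a 3D displacement vector and applies it to a flat unit square. The function name `apply_vdm_to_square`, the grid layout, and the toy values are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def apply_vdm_to_square(vdm):
    """Displace a flat unit-square grid by a VDM image of shape (H, W, 3).

    Hypothetical helper: each pixel's 3D vector is added to the matching
    grid vertex, which starts in the z = 0 plane.
    """
    h, w, _ = vdm.shape
    u, v = np.meshgrid(np.linspace(0.0, 1.0, w), np.linspace(0.0, 1.0, h))
    base = np.stack([u, v, np.zeros_like(u)], axis=-1)  # flat square, (H, W, 3)
    return base + vdm

# Toy VDM: one vertex is lifted and also pushed sideways. The sideways
# component is what a scalar height map cannot express.
vdm = np.zeros((5, 5, 3))
vdm[2, 2] = [0.1, 0.0, 0.5]
verts = apply_vdm_to_square(vdm)
```

Because the displacement is zero on the border in this toy case, the displaced patch still meets the flat square at its edges, which is the property that lets a stamp blend into a host surface.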

Paper Structure

This paper contains 24 sections, 3 equations, 12 figures, 3 tables.

Figures (12)

  • Figure 1: We introduce GenVDM, a method that can generate a highly detailed Vector Displacement Map (VDM) from a single input image. The generated VDMs can be directly applied to mesh surfaces to create intricate geometric details. Note that the thumbnails represent plain 2D RGB image sources.
  • Figure 2: Overview of our image-to-VDM pipeline. Given an input image, we first add a gray square behind the object/part in the image as background, so the image resembles a textured VDM applied to a square mesh, as in (a). Then we utilize a multi-view image diffusion model to generate six normal maps with pre-defined camera poses, as in (b). The multi-view normal maps effectively represent the geometry of the VDM when applied to a square mesh, and thus we can reconstruct the VDM from these normal maps, as in (c). The reconstructed VDM can then be applied to various surfaces as in (d).
  • Figure 3: Reconstructing VDM from multi-view normal maps. We adopt a two-step approach. First, we reconstruct an accurate (but perhaps noisy) mesh (b) from the multi-view normals (a) with differentiable rendering and a neural SDF representation. Then we parameterize the mesh by fitting a deformable square to it with a neural deformation field, as in (c). A VDM image can thus be obtained by discretizing the square into pixels and inferring each pixel's displacement from the neural deformation field. The whole reconstruction pipeline takes about 6 minutes for each shape on an NVIDIA A100 GPU, where each step takes about 3 minutes.
  • Figure 4: Comparison of different approaches for parameterizing a shape into a VDM. (a) Topology fixing and Tutte embedding with classic tools lead to noise and distortion. (b) Fitting a plane mesh to the target mesh leads to large distortion. (c) Our approach, which applies a neural deformation field to a parametric square, leads to a clean, high-quality reconstruction.
  • Figure 5: Data preparation. For each interesting object (a), we use a 3D lasso tool to segment out interesting parts. For each part, we densely sample points on the part's surface and then perform Screened Poisson Surface Reconstruction [Screened_poisson] to obtain a single connected mesh (b). We then stitch the mesh to a square mesh with an algorithm inspired by Poisson Image Editing [Poisson_image_editing] (c). Afterwards, we can color the part and render RGB images (d) and normal maps (e) for training the image diffusion model.
  • ...and 7 more figures
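
The last step described in the Figure 3 caption, discretizing the square into pixels and querying a displacement per pixel, can be sketched as follows. This is a minimal sketch under stated assumptions: `toy_deformation` is a hypothetical analytic stand-in for the trained neural deformation field, and `bake_vdm` and the 64-pixel resolution are illustrative choices rather than the authors' implementation.

```python
import numpy as np

def toy_deformation(uv):
    """Hypothetical stand-in for a trained neural deformation field.

    Maps (N, 2) points in [0, 1]^2 to (N, 3) displacement vectors.
    """
    u, v = uv[:, 0], uv[:, 1]
    window = np.sin(np.pi * u) * np.sin(np.pi * v)  # vanishes on the border
    dx = 0.2 * window * (v - 0.5)                   # sideways shear
    dy = np.zeros_like(u)
    dz = 0.5 * window                               # outward bump
    return np.stack([dx, dy, dz], axis=-1)

def bake_vdm(field, resolution):
    """Sample `field` on a regular pixel grid over the unit square."""
    ticks = np.linspace(0.0, 1.0, resolution)
    u, v = np.meshgrid(ticks, ticks)
    uv = np.stack([u.ravel(), v.ravel()], axis=-1)  # (resolution^2, 2)
    return field(uv).reshape(resolution, resolution, 3)

vdm_image = bake_vdm(toy_deformation, 64)
```

Any callable with the same input and output shapes could replace `toy_deformation`, including a batched forward pass of a learned network wrapped to accept and return NumPy arrays.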