MagicClay: Sculpting Meshes With Generative Neural Fields

Amir Barda, Vladimir G. Kim, Noam Aigerman, Amit H. Bermano, Thibault Groueix

TL;DR

This paper introduces a hybrid approach that keeps a mesh and a Signed Distance Field (SDF) representation consistent with each other, and MagicClay, a tool for sculpting regions of a mesh according to textual prompts while keeping other regions untouched.

Abstract

Recent developments in neural fields have brought phenomenal capabilities to the field of shape generation, but they lack crucial properties such as incremental control, a fundamental requirement for artistic work. Triangular meshes, on the other hand, are the representation of choice for most geometry-related tasks, offering efficiency and intuitive control, but do not lend themselves to neural optimization. To support downstream tasks, previous art typically proposes a two-step approach, where first a shape is generated using neural fields, and then a mesh is extracted for further processing. Instead, in this paper we introduce a hybrid approach that consistently maintains both a mesh and a Signed Distance Field (SDF) representation. Using this representation, we introduce MagicClay, an artist-friendly tool for sculpting regions of a mesh according to textual prompts while keeping other regions untouched. Our framework carefully and efficiently balances consistency between the representations and regularizations in every step of the shape optimization. Relying on the mesh representation, we show how to render the SDF at higher resolutions and faster. In addition, we employ recent work in differentiable mesh reconstruction to adaptively allocate triangles in the mesh where required, as indicated by the SDF. Using an implemented prototype, we demonstrate superior generated geometry compared to the state of the art, and novel consistent control, allowing sequential prompt-based edits to the same mesh for the first time.

Paper Structure

This paper contains 30 sections, 5 equations, 12 figures, and 1 table.

Figures (12)

  • Figure 1: Overview of the hybrid optimization. We jointly optimize a mesh, an SDF, and a shared appearance MLP according to an input prompt. We can either optimize the full geometry, or only a user-selected portion of the mesh for an iterative 3D modeling workflow. We can also preserve existing textures on the non-selected parts of the mesh, or have the diffusion model generate textures for the full mesh. We start by differentiably rendering both representations and enforcing their consistency. Because they are kept in sync, we use the mesh to efficiently sample volumetric rays and render high-resolution maps from the SDF in a memory-efficient manner. Applying SDS-type losses on these high-resolution renderings allows for capturing finer details. We sync the mesh and the SDF via multi-view consistency constraints on the RGB pixels, the image opacity, and the surface normals. The local topology of the mesh is updated according to the SDF using ROAR [barda2023roar], splitting triangles where geometry is created and collapsing edges where needed. Additionally, we leverage representation-specific losses to regularize the optimization: an Eikonal loss on the SDF and a smoothness loss on the mesh.
  • Figure 2: Mesh and SDF robustness to noisy gradients. We optimize a mesh, an SDF, and our hybrid representation with multi-view reconstruction losses after applying various noise levels to the ground-truth renderings. We report the L2 reprojection error against ground-truth renders from novel views. The SDF is more robust than the mesh in the high-noise regime. We show the results for both the mesh (hybrid-mesh) and the SDF (hybrid-SDF) in our hybrid representation. The hybrid-mesh significantly outperforms the mesh-only baseline in the high-noise regime.
  • Figure 3: Qualitative comparison on text-to-3D from scratch. To validate the benefit of our hybrid approach, we compare the quality of the triangular meshes extracted from HIFA [zhu2023hifa], MVDream [shi2023mvdream], TextMesh [tsalicoglou2023textmesh], Fantasia3D [chen2023fantasia3d], and ProlificDreamer [wang2023prolificdreamer]. For each prompt and method, we show the normal and RGB rendering on the top left, and the textureless mesh on the bottom right. While all methods produce realistic RGB renderings, only our hybrid representation generates smooth geometry, as highlighted by the red insets.
  • Figure 4: Ablation: no topology updates. Optimizing the mesh without topology updates limits the final generated object to the initial mesh resolution. Left: when starting from a fine mesh, the optimization often gets stuck because each vertex has only a tiny effect on the objective; notice that the sword is unable to grow its tip. Right: when starting from a coarse mesh, no fine details can be created.
  • Figure 5: Ablation on high-resolution renderings. Without our scheme for rendering the SDF in high resolution, which uses the mesh counterpart to localize samples along each ray, we fall back to regular low-resolution SDS on the SDF renderings, leading to poorer quality in the generated shapes.
  • ...and 7 more figures
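
As an illustration of two ingredients named in the Figure 1 caption, the sketch below shows (a) an Eikonal regularizer, which penalizes deviation of the SDF gradient norm from 1, and (b) ray depth samples concentrated in a narrow band around the mesh surface. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the names sphere_sdf, eikonal_loss, and narrow_band_samples and the parameters eps, band, and n_samples are hypothetical, the gradient is estimated with finite differences rather than automatic differentiation, and the symmetric band around the mesh depth is an assumed sampling strategy.

```python
import numpy as np


def sphere_sdf(points, radius=1.0):
    """Signed distance to a sphere centred at the origin (negative inside)."""
    return np.linalg.norm(points, axis=-1) - radius


def eikonal_loss(sdf, points, eps=1e-4):
    """Mean squared deviation of the SDF gradient norm from 1.

    points: (n_points, dim) array of query locations.
    The gradient is estimated with central finite differences (an
    assumption made for this sketch); a neural SDF would normally be
    differentiated with automatic differentiation instead.
    """
    grads = np.zeros_like(points)
    for axis in range(points.shape[-1]):
        offset = np.zeros(points.shape[-1])
        offset[axis] = eps
        grads[:, axis] = (sdf(points + offset) - sdf(points - offset)) / (2.0 * eps)
    grad_norm = np.linalg.norm(grads, axis=-1)
    return float(np.mean((grad_norm - 1.0) ** 2))


def narrow_band_samples(mesh_depth, band=0.05, n_samples=8):
    """Depths along each ray, restricted to a band around the mesh hit.

    mesh_depth: (n_rays,) depth of the first mesh intersection per ray.
    Returns an (n_rays, n_samples) array spanning
    [mesh_depth - band, mesh_depth + band] (hypothetical band width).
    """
    offsets = np.linspace(-band, band, n_samples)
    return mesh_depth[:, None] + offsets[None, :]


# Random query points in a cube around the sphere.
rng = np.random.default_rng(0)
pts = rng.uniform(-2.0, 2.0, size=(1024, 3))

# A true distance field versus a field scaled by 2 (gradient norm 2).
loss_valid = eikonal_loss(sphere_sdf, pts)
loss_scaled = eikonal_loss(lambda p: 2.0 * sphere_sdf(p), pts)

# Two rays whose mesh hits are at depths 1.0 and 2.5.
depths = narrow_band_samples(np.array([1.0, 2.5]))
```

For a true distance field such as the sphere, the Eikonal loss is close to zero, while scaling the field by 2 doubles the gradient norm and gives a loss near 1. Restricting samples to a band around the mesh depth means only a few SDF evaluations are needed per ray, which is the intuition behind using the mesh to make high-resolution SDF rendering affordable.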