CNS-Edit: 3D Shape Editing via Coupled Neural Shape Optimization

Jingyu Hu, Ka-Hei Hui, Zhengzhe Liu, Hao Zhang, Chi-Wing Fu

TL;DR

CNS-Edit introduces a Coupled Neural Shape (CNS) representation that pairs a global latent code $z$ with a 3D neural feature volume $F$ to enable fine-grained, topology-aware 3D shape editing in latent space. Edits are realized by a coupled neural shape optimization that converts each user operation (copy, resize, delete, drag) into an operator-specific objective $L_{op}$, iteratively updates $(z,F)$ for $N$ steps, and then decodes the updated code $z_N$ into the edited shape. The approach uses a wavelet-based input representation and a diffusion-based autoencoder to extract $z$, while $F$ is obtained from intermediate Diffusion U-Net features; the tight coupling between the two allows edits to be backpropagated between $F$ and $z$. Quantitative metrics (FID, KID, QS, MS) and qualitative results show that CNS-Edit achieves higher-fidelity and more semantically consistent edits, including topology changes, than state-of-the-art methods. Limitations include reliance on category-specific latent spaces and modest per-operation speed; future work aims at broader shape categories, faster reconstruction, and additional editing operators.

Abstract

This paper introduces a new approach based on a coupled representation and a neural volume optimization to implicitly perform 3D shape editing in latent space. This work has three innovations. First, we design the coupled neural shape (CNS) representation for supporting 3D shape editing. This representation includes a latent code, which captures high-level global semantics of the shape, and a 3D neural feature volume, which provides a spatial context to associate with the local shape changes given by the editing. Second, we formulate the coupled neural shape optimization procedure to co-optimize the two coupled components in the representation subject to the editing operation. Last, we offer various 3D shape editing operators, i.e., copy, resize, delete, and drag, and derive each into an objective for guiding the CNS optimization, such that we can iteratively co-optimize the latent code and neural feature volume to match the editing target. With our approach, we can achieve a rich variety of editing results that are not only aware of the shape semantics but are also not easy to achieve by existing approaches. Both quantitative and qualitative evaluations demonstrate the strong capabilities of our approach over the state-of-the-art solutions.

Paper Structure

This paper contains 21 sections, 2 equations, 11 figures, and 2 tables.

Figures (11)

  • Figure 1: We propose a novel coupled neural shape representation, equipped with a family of user-friendly shape editing operators: (i) drag (first column), (ii) delete (second column), (iii) copy (third column), and (iv) resize (fourth column). The top row shows the input shapes and operators, whereas the bottom row shows the edited results.
  • Figure 2: Overview of our framework. (a) We propose a new coupled neural shape (CNS) representation, consisting of latent code $z$ and neural feature volume $F$. From a given shape, we first adopt an encoder network to derive its global latent code $z$. Then code $z$ is fed into the Diffusion U-Net to extract intermediate features, from which we obtain the neural feature volume $F$. Notice that code $z$ and neural volume $F$ are closely coupled. Next, we provide (b) a family of operators, i.e., copy, resize, delete, and drag, for shape editing, and (c) transform the operator into an objective for guiding the iterative co-optimization of $z$ and $F$. After $N$ iterations of co-optimization, (d) we can obtain the updated latent code $z_N$ and decode it to produce the edited shape.
  • Figure 3: Shape editing operators: copy, resize, delete, and drag. Note the fidelity of the edited shapes produced by our method.
  • Figure 4: The cut-paste operator combines the copy and delete operators.
  • Figure 5: Visual results from the ablation study. Using features closer to the output when constructing neural volume $F$ introduces artifacts in the edited shapes, as seen in (e). However, features from layers that are too deep lack spatial context, resulting in less effective editing, as evident in (c). Further, applying our operators directly in the spatial domain loses shape semantics during editing (compare (l) and (m)) and also causes artifacts in the edited shapes, noticeable in (i) and (m) versus (h) and (l), respectively.
  • ...and 6 more figures
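
The co-optimization loop described in the Figure 2 caption (turn an operator into an objective, update the coupled pair for $N$ iterations, then keep $z_N$) can be illustrated with a deliberately tiny toy. This is only a sketch under strong assumptions, not the paper's implementation: a fixed linear map `W` stands in for the Diffusion U-Net feature extractor, a flat list stands in for the 3D feature volume $F$, and the "copy" objective (match features at destination cells to the original features at source cells) is a hypothetical form of $L_{op}$ chosen for illustration. All function names, the learning rate, and the step count are likewise made up.

```python
# Toy sketch only. Assumptions (not from the paper): a linear map W replaces
# the Diffusion U-Net, a 1D list replaces the 3D volume F, and the "copy"
# loss below is a hypothetical form of L_op.

def extract_features(W, z):
    """Stand-in for 'z -> F': F[i] = sum_j W[i][j] * z[j]."""
    return [sum(w * z_j for w, z_j in zip(row, z)) for row in W]

def copy_loss(F, F_ref, src, dst):
    """Assumed copy objective: features at dst should equal the
    unedited features at src."""
    return sum((F[d] - F_ref[s]) ** 2 for s, d in zip(src, dst))

def copy_loss_grad(F, F_ref, src, dst):
    """Gradient of copy_loss with respect to F."""
    g = [0.0] * len(F)
    for s, d in zip(src, dst):
        g[d] += 2.0 * (F[d] - F_ref[s])
    return g

def co_optimize(W, z0, src, dst, n_steps=200, lr=0.05):
    """Run n_steps of gradient descent on z, re-deriving F from z each step."""
    F_ref = extract_features(W, z0)  # features of the unedited shape, held fixed
    z = list(z0)
    losses = []
    for _ in range(n_steps):
        F = extract_features(W, z)             # F is coupled to the current z
        losses.append(copy_loss(F, F_ref, src, dst))
        g_F = copy_loss_grad(F, F_ref, src, dst)
        # Chain rule through the coupling: dL/dz_j = sum_i dL/dF_i * W[i][j]
        g_z = [sum(g_F[i] * W[i][j] for i in range(len(W)))
               for j in range(len(z))]
        z = [z_j - lr * g for z_j, g in zip(z, g_z)]
    return z, losses

# Four feature cells driven by a three-dimensional latent code.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.5, -1.0, 0.5]]
z0 = [1.0, 0.5, -0.5]

# "Copy" the feature at cell 0 onto cell 3.
z_N, losses = co_optimize(W, z0, src=[0], dst=[3])
F_N = extract_features(W, z_N)
```

In this toy, the objective is defined on the feature cells but the update is applied to the latent code, so the edit at cell 3 is achieved only through a change in `z`. In the actual method the decoder would then turn $z_N$ into the edited shape; that decoding step is omitted here.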