CNS-Edit: 3D Shape Editing via Coupled Neural Shape Optimization
Jingyu Hu, Ka-Hei Hui, Zhengzhe Liu, Hao Zhang, Chi-Wing Fu
TL;DR
CNS-Edit introduces a Coupled Neural Shape (CNS) representation that pairs a global latent code $z$ with a 3D neural feature volume $F$ to enable fine-grained, topology-aware 3D shape editing in latent space. Edits are realized by a coupled neural shape optimization that converts each user operation (copy, resize, delete, drag) into an operator-specific objective $L_{op}$ and iteratively updates $(z,F)$ for $N$ steps, then decodes the updated code $z_N$ into the edited shape. The approach uses a wavelet-based input representation and a diffusion-based autoencoder to extract $z$, while $F$ is obtained from intermediate diffusion U-Net features; the two are tightly coupled, so edits can be backpropagated between $F$ and $z$. Quantitative metrics (FID, KID, QS, MS) and qualitative results show that CNS-Edit produces higher-fidelity, semantically consistent edits, including topology changes, compared with state-of-the-art methods. Limitations include reliance on category-specific latent spaces and modest per-operation speed; future work aims at broader shape categories, faster reconstruction, and additional editing operators.
Abstract
This paper introduces a new approach based on a coupled representation and a neural volume optimization to implicitly perform 3D shape editing in latent space. This work has three innovations. First, we design the coupled neural shape (CNS) representation to support 3D shape editing. This representation includes a latent code, which captures the high-level global semantics of the shape, and a 3D neural feature volume, which provides a spatial context to associate with the local shape changes given by the editing. Second, we formulate the coupled neural shape optimization procedure to co-optimize the two coupled components of the representation subject to the editing operation. Third, we offer various 3D shape editing operators, i.e., copy, resize, delete, and drag, and derive an objective from each to guide the CNS optimization, such that we can iteratively co-optimize the latent code and the neural feature volume to match the editing target. With our approach, we can achieve a rich variety of editing results that are aware of the shape semantics and are not easy to achieve with existing approaches. Both quantitative and qualitative evaluations demonstrate the strong capabilities of our approach over state-of-the-art solutions.
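To make the coupled optimization loop concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a linear map `W` as a stand-in for the diffusion U-Net that produces the feature volume $F$ from $z$, a flattened 8-cell "volume", and a squared-error form for the copy objective $L_{op}$. All names (`features`, `copy_loss`, `optimize`, `W`, `targets`) are hypothetical. It only illustrates the pattern of recomputing $F$ from $z$, evaluating $L_{op}$ on the edited cells, and backpropagating to $z$ for $N$ steps.

```python
import random

def features(z, W):
    """Toy stand-in for the feature volume F: a linear map of z (assumption)."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in W]

def copy_loss(F, targets):
    """Hypothetical L_op for 'copy': squared error on the edited cells only."""
    return sum((F[i] - v) ** 2 for i, v in targets.items())

def optimize(z0, W, targets, n_steps=200, lr=0.05):
    """Run N gradient steps on z; F is recomputed from z at every step."""
    z = list(z0)
    history = []
    for _ in range(n_steps):
        F = features(z, W)
        history.append(copy_loss(F, targets))
        # Chain rule through the linear map:
        # dL/dz_j = sum_i 2 * (F_i - target_i) * W[i][j]
        grad = [sum(2.0 * (F[i] - v) * W[i][j] for i, v in targets.items())
                for j in range(len(z))]
        z = [zj - lr * g for zj, g in zip(z, grad)]
    history.append(copy_loss(features(z, W), targets))
    return z, history

random.seed(0)
d, m = 4, 8  # latent size and number of feature cells (toy values)
W = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(m)]
z0 = [random.uniform(-1, 1) for _ in range(d)]
F0 = features(z0, W)

# "Copy" cells 0 and 1 onto cells 5 and 6; source values are frozen from F0.
targets = {5: F0[0], 6: F0[1]}
z_N, history = optimize(z0, W, targets)
print(f"L_op before: {history[0]:.6f}, after: {history[-1]:.6f}")
```

In this sketch only $z$ is a free variable and $F$ is always derived from it, which is one simple way to keep the two components consistent. The method described above co-optimizes $(z, F)$ and decodes $z_N$ with a learned decoder, which the toy omits.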
