Manipulating Vehicle 3D Shapes through Latent Space Editing

JiangDong Miao, Tatsuya Ikeda, Bisser Raytchev, Ryota Mizoguchi, Takenori Hiraoka, Takuji Nakashima, Keigo Shimizu, Toru Higaki, Kazufumi Kaneda

TL;DR

This work addresses the lack of fine-grained editing for existing 3D vehicles by learning latent-space editing directions in a modified DeepSDF framework. A pre-trained regressor maps latent codes to geometry and style attributes, enabling continuous edits via directions $\mathbf{d}_i$ and a perturbation $\epsilon$, optimized with a loss $\mathcal{L}=\lambda_1\mathcal{L}_{reg}+\lambda_2\mathcal{L}_{content}$ to balance attribute changes with identity preservation. The approach introduces a Position Enhanced DeepSDF to incorporate NeRF-style position embeddings, improving detail in reconstructed shapes, and provides two latent editors, a 4-layer MLP and a Kolmogorov-Arnold Network (KAN), to generate edited latent codes $\mathbf{z}'$ from $\mathbf{z}$ and $\epsilon$. Experimental results demonstrate accurate geometry edits, style edits, and multi-attribute editing on a dataset of 180 vehicles, with latent codes showing meaningful semantic structure via $t$-SNE and maintained identity across edits. The framework enables precise, data-efficient editing of real 3D objects, with potential impact on vehicle design workflows and aerodynamic studies.
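
The editing objective summarized above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the linear shift $\mathbf{z}' = \mathbf{z} + \sum_i \epsilon_i \mathbf{d}_i$, the linear stand-in for the pre-trained regressor, the latent-distance form of $\mathcal{L}_{content}$, and the names `edit_latent`, `editing_loss`, `lam_reg`, and `lam_content` are all assumptions made for illustration. The paper's editors are a 4-layer MLP and a KAN rather than a plain linear shift.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, N_ATTRS = 8, 3  # hypothetical sizes, not taken from the paper

# Stand-in for the pre-trained regressor: a fixed linear map (assumption).
W = rng.normal(size=(N_ATTRS, LATENT_DIM))


def regressor(z):
    """Predict attribute values from a latent code (toy linear model)."""
    return W @ z


def edit_latent(z, directions, eps):
    """Shift z along the attribute directions: z' = z + sum_i eps_i * d_i.

    This linear form is an assumed simplification of the latent editor.
    """
    return z + eps @ directions


def editing_loss(z, z_edit, eps, lam_reg=1.0, lam_content=0.1):
    """L = lam_reg * L_reg + lam_content * L_content (weights are hypothetical).

    L_reg pushes the predicted attributes of z' toward the requested change.
    L_content is assumed here to be a latent-space distance that keeps z'
    close to z, standing in for the identity-preservation term.
    """
    target = regressor(z) + eps
    l_reg = np.mean((regressor(z_edit) - target) ** 2)
    l_content = np.mean((z_edit - z) ** 2)
    return lam_reg * l_reg + lam_content * l_content


z = rng.normal(size=LATENT_DIM)                       # latent code sampled from N(0, I)
directions = rng.normal(size=(N_ATTRS, LATENT_DIM))   # d_1..d_n (trainable in the paper)
eps = np.array([0.5, 0.0, 0.0])                       # edit only the first attribute
z_edit = edit_latent(z, directions, eps)
loss = editing_loss(z, z_edit, eps)
```

In a training loop the directions (or the editor network) would be updated to reduce this loss while the regressor stays frozen; the two weights trade attribute accuracy against identity preservation.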

Abstract

Although 3D object editing has the potential to significantly influence various industries, recent research in 3D generation and editing has primarily focused on converting text and images into 3D models, often overlooking the need for fine-grained control over the editing of existing 3D objects. This paper introduces a framework that employs a pre-trained regressor, enabling continuous, precise, attribute-specific modifications to both the stylistic and geometric attributes of vehicle 3D models. Our method not only preserves the inherent identity of vehicle 3D objects, but also supports multi-attribute editing, allowing for extensive customization without compromising the model's structural integrity. Experimental results demonstrate the efficacy of our approach in achieving detailed edits on various vehicle 3D models.

Paper Structure

This paper contains 23 sections, 7 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: Overview of the editing flow for the 3D vehicle model proposed in this paper. During the editing process, users can specify a parameter (either geometric or stylistic) they wish to edit, along with the desired intensity of the edit. This parameter, representing the editing requirements, is then fed into the trained latent editor. The module adjusts the original latent code of the vehicle model according to the specified editing requirements and returns a new latent code. The updated latent code is input into the pre-trained DeepSDF, which maps it to signed distance function (SDF) values. These SDF values are subsequently processed through the marching cubes algorithm to produce the new 3D vehicle model.
  • Figure 2: The Training Strategy for the Latent Code Editor. In our training pipeline, the latent code $\mathbf{z}$ is sampled from a normal distribution. The elements $\{\mathbf{d}_1, \mathbf{d}_2, \ldots, \mathbf{d}_n\}$ represent trainable modules that denote directions in the latent space, and $\epsilon$ denotes the attributes to be edited along with their transformation magnitudes. The attribute values of the original and transformed latent codes are predicted by a pre-trained regressor. The regression loss $L_{\text{reg}}$ and the identity loss $L_{\text{context}}$ are employed to update the parameters. The details of the training flow are explained in the paper's training-flow section.
  • Figure 3: Position Enhanced DeepSDF
  • Figure 4: 3D Reconstruction quality
  • Figure 5: The improvement of position embedding
  • ...and 7 more figures
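
The NeRF-style position embedding behind the Position Enhanced DeepSDF (Figures 3 and 5) can be sketched as follows. This is a minimal illustration of the standard NeRF encoding $\gamma(p) = (\sin(2^k \pi p), \cos(2^k \pi p))_{k=0}^{L-1}$, not the paper's exact configuration: the function name `positional_encoding`, the default `num_freqs=4`, and the choice to keep the raw coordinates alongside the encoded ones are assumptions.

```python
import numpy as np


def positional_encoding(points, num_freqs=4):
    """Encode 3D query points with sinusoids of increasing frequency.

    points: array of shape (N, 3).
    Returns an array of shape (N, 3 + 3 * 2 * num_freqs): the raw
    coordinates followed by sin/cos features for each coordinate.
    """
    points = np.asarray(points, dtype=np.float64)
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi    # 2^k * pi, k = 0..L-1
    angles = points[..., None] * freqs               # shape (N, 3, L)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)  # (N, 3, 2L)
    enc = enc.reshape(points.shape[0], -1)           # flatten to (N, 3 * 2L)
    return np.concatenate([points, enc], axis=-1)


# Encode a few query points before they would be passed to an SDF decoder.
queries = np.array([[0.0, 0.0, 0.0], [0.25, -0.5, 1.0]])
features = positional_encoding(queries, num_freqs=4)
```

The encoded features would replace the raw xyz input of the SDF decoder, concatenated with the latent code; the high-frequency terms are what let the network represent finer surface detail.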