SYM3D: Learning Symmetric Triplanes for Better 3D-Awareness of GANs

Jing Yang; Kyle Fogarty; Fangcheng Zhong; Cengiz Oztireli

TL;DR

SYM3D addresses the challenge of learning high-fidelity 3D assets from single 2D views without camera poses by introducing symmetry-aware triplanes. The method decouples geometry and texture into separate triplanes and augments them with view-wise spatial attention and reflectional symmetry regularization, enabling consistent orientation and improved detail across shapes. Empirical results on ShapeNet and ABO-Chair show SYM3D surpasses GET3D and OP3D in geometry and texture quality, and exhibits robustness to incomplete views and artifacts in text-to-3D settings. This approach highlights the practical value of structural priors, particularly symmetry, for data-efficient 3D-aware generation in real-world, pose-unknown scenarios.

Abstract

Despite the growing success of 3D-aware GANs, which can be trained on 2D images to generate high-quality 3D assets, they still rely on multi-view images with camera annotations to synthesize sufficient details from all viewing directions. However, the scarce availability of calibrated multi-view image datasets, especially in comparison to single-view images, has limited the potential of 3D GANs. Moreover, while bypassing camera pose annotations with a camera distribution constraint reduces dependence on exact camera parameters, it still struggles to generate a consistent orientation of 3D assets. To this end, we propose SYM3D, a novel 3D-aware GAN designed to leverage the prevalent reflectional symmetry structure found in natural and man-made objects, alongside a proposed view-aware spatial attention mechanism in learning the 3D representation. We evaluate SYM3D on both synthetic (ShapeNet Chairs, Cars, and Airplanes) and real-world datasets (ABO-Chair), demonstrating its superior performance in capturing detailed geometry and texture, even when trained on only single-view images. Finally, we demonstrate the effectiveness of incorporating symmetry regularization in helping reduce artifacts in the modeling of 3D assets in the text-to-3D task. The project page is at \url{https://jingyang2017.github.io/sym3d.github.io/}

Paper Structure (13 sections, 8 equations, 9 figures, 4 tables)

Introduction
Related work
Method
Triplane Representation of 3D Assets
View-wise Spatial Attention
Reflectional Symmetry Regularization
Training Objectives
Experiment
Settings
Main Results
Properties of Learned Triplane
Further Analysis
Conclusion

Figures (9)

Figure 1: Comparison of shapes generated by GET3D \cite{get3d} and our SYM3D, rendered in Blender. SYM3D learns symmetric triplanes to improve the 3D-awareness of GANs. Compared to GET3D, SYM3D can synthesize diverse objects with reasonable geometry and texture after training on datasets with incomplete views. Refer to Section \ref{datasets} for dataset details.
Figure 2: Overview of the proposed SYM3D. Random input vectors $z_g$ and $z_t$ are first mapped to a latent space ($w_g$ and $w_t$) and then fed into a shared generator to create the axis-aligned triplanes: geometry triplane $G$ and texture triplane $T$. We assume that the shapes being modeled have a symmetry plane ($XY$) such that a subset of the axis-aligned planes ($YZ, XZ$) can be regularized to exploit such symmetry. We apply view-wise attention (Section \ref{sec:sve}) to the geometry triplane, and regularize both the geometry triplane and the attention map with reflectional symmetry (Section \ref{sec:sr}). We use DMTet \cite{dmtet} to extract a 3D mesh. We describe a surface point $p$ with both its original feature and its reflected feature in the texture triplane. Using differentiable rendering \cite{laine2020modular}, we render RGB images and their silhouettes from different camera angles. We then use two discriminators to determine whether the RGB and silhouette images are real or fake, without requiring the camera pose of real images.
Figure 3: Illustration of proposed view-wise spatial attention (VSA) module. This module analyzes each plane individually, utilizing spatial features as guidance for attention.
Figure 4: Qualitative comparison of SYM3D against OP3D and GET3D on generated images. SYM3D produces images with sharp details and a high diversity of shapes.
Figure 5: Rendered RGB images across different camera views.
...and 4 more figures
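
The Figure 2 caption describes two uses of reflectional symmetry: regularizing the planes that contain the axis normal to the symmetry plane ($YZ$ and $XZ$), and describing a surface point $p$ by both its own feature and the feature of its mirror image. The following is a minimal Python sketch of both ideas, not the paper's implementation. It assumes single-channel planes stored as nested lists over $[-1, 1]$, nearest-neighbour lookup, summation of the three plane features, and an L1 penalty. All function names and the dictionary layout are hypothetical.

```python
# Illustrative sketch only. The plane layout, the L1 penalty, the nearest-neighbour
# lookup, and every name below are assumptions, not taken from the paper.

def reflect_z(plane):
    """Mirror a plane about z = 0. Rows index x or y; columns index z."""
    return [row[::-1] for row in plane]

def symmetry_penalty(plane):
    """Mean absolute difference between a plane and its z-mirrored copy."""
    mirrored = reflect_z(plane)
    total = sum(abs(a - b)
                for row, mrow in zip(plane, mirrored)
                for a, b in zip(row, mrow))
    count = sum(len(row) for row in plane)
    return total / count

def symmetrize(plane):
    """Average a plane with its mirror image, giving an exactly symmetric plane."""
    mirrored = reflect_z(plane)
    return [[0.5 * (a + b) for a, b in zip(row, mrow)]
            for row, mrow in zip(plane, mirrored)]

def to_index(coord, res):
    """Map a coordinate in [-1, 1] to the nearest grid index in [0, res - 1]."""
    idx = round((coord + 1.0) / 2.0 * (res - 1))
    return max(0, min(res - 1, idx))

def query(triplane, point):
    """Sum the features found by projecting the point onto the XY, XZ, and YZ planes."""
    x, y, z = point
    res = len(triplane["xy"])
    ix, iy, iz = (to_index(c, res) for c in (x, y, z))
    return triplane["xy"][ix][iy] + triplane["xz"][ix][iz] + triplane["yz"][iy][iz]

def describe_point(triplane, point):
    """Return the feature at p and at its reflection across the XY plane (z -> -z)."""
    x, y, z = point
    return query(triplane, point), query(triplane, (x, y, -z))
```

Under these assumptions, `symmetry_penalty` is zero exactly when a plane is mirror-symmetric in $z$, and `describe_point` returns the same pair in swapped order for $p$ and its reflection. How the two features are combined downstream (for example by concatenation or averaging) is left open here, since the caption does not specify it.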
