Learning 3D-Aware GANs from Unposed Images with Template Feature Field
Xinya Chen, Hanlei Guo, Yanrui Bin, Shangzhan Zhang, Yuanbo Yang, Yue Wang, Yujun Shen, Yiyi Liao
TL;DR
This paper introduces TeFF, a template feature field that enables 3D-aware GAN training from unposed images by jointly learning a 3D semantic template alongside the radiance field. Pose estimation for real images is performed on the fly via discretized camera poses and phase correlation, leveraging semantically aligned DINO features to recover full 3D geometry across challenging datasets. The approach uses a 2D background generator and dual discriminators to stabilize training, and EMA-derived template fields to facilitate pose matching. Across cars, planes, and elephants, TeFF outperforms state-of-the-art baselines in FID, depth accuracy, and pose distribution fidelity, demonstrating robust 360-degree rendering without ground-truth poses. The method makes 3D-aware generative modeling applicable to real-world, unposed image collections. Its limitations include a single template per category and sensitivity to perspective distortions.
Abstract
Accurate camera poses for training images have been shown to benefit the learning of 3D-aware generative adversarial networks (GANs), yet collecting them can be quite expensive in practice. This work targets learning 3D-aware GANs from unposed images, for which we propose to perform on-the-fly pose estimation of training images with a learned template feature field (TeFF). Concretely, in addition to a generative radiance field as in previous approaches, we ask the generator to also learn a field from 2D semantic features while sharing the density from the radiance field. Such a framework allows us to acquire a canonical 3D feature template leveraging the dataset mean discovered by the generative model, and further to efficiently estimate the pose parameters on real data. Experimental results on various challenging datasets demonstrate the superiority of our approach over state-of-the-art alternatives, both qualitatively and quantitatively.
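
The TL;DR names phase correlation as one ingredient of the on-the-fly pose estimation. As a rough illustration of that general technique only, the sketch below recovers an integer 2D shift between two arrays with NumPy's FFT. It is not the authors' implementation: the function name `phase_correlation`, the `eps` parameter, and the use of random arrays in place of rendered template features and DINO features are all assumptions made for this example, and how TeFF maps such a correlation peak to camera pose parameters is not described in this section.

```python
import numpy as np


def phase_correlation(a, b, eps=1e-8):
    """Estimate the integer circular shift (dy, dx) with b ~= np.roll(a, (dy, dx), axis=(0, 1)).

    Hypothetical helper for illustration; `a` and `b` are 2D arrays of equal shape.
    """
    fa = np.fft.fft2(a)
    fb = np.fft.fft2(b)

    # Normalized cross-power spectrum: keep only the phase difference.
    cross = fb * np.conj(fa)
    cross /= np.abs(cross) + eps

    # The inverse transform peaks at the displacement between the two inputs.
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)

    # Indices past the midpoint correspond to negative shifts (wrap-around).
    shift = tuple(int(p) if p <= n // 2 else int(p) - n
                  for p, n in zip(peak, corr.shape))
    return shift, corr


# Usage: shift a random array by a known amount and recover the shift.
rng = np.random.default_rng(0)
reference = rng.standard_normal((64, 64))
shifted = np.roll(reference, (5, -3), axis=(0, 1))
estimated_shift, _ = phase_correlation(reference, shifted)
```

Because only the phase of the cross-power spectrum is kept, the correlation peak is sharp and largely insensitive to global intensity scaling, which is one reason the technique suits fast matching against a fixed template.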
