One-shot Generative Domain Adaptation in 3D GANs

Ziqiang Li, Yi Wu, Chaoyue Wang, Xue Rui, Bin Li

TL;DR

This paper tackles one-shot 3D Generative Domain Adaptation by transferring a pre-trained 3D EG3D-based generator to a new domain using a single reference image. It proposes 3D-Adapter, which restricts fine-tuning to the Tri-Plane Decoder ($\textbf{Tri-D}$) and the Style-based Super-resolution module ($\mathbf{G2}$), and employs four losses—$L_{dir}$, $L_{dis}$, $L_{I-str}$, and $L_{F-str}$—alongside a progressive two-step training scheme to achieve high fidelity, large diversity, cross-domain consistency, and multi-view consistency. The method demonstrates strong quantitative and qualitative gains over baselines on multiple target domains, and extends to zero-shot scenarios with competitive results, while preserving latent-interpolation, inversion, and editing capabilities. Limitations include imperfect cross-domain consistency in some cases and a focus on single-domain adaptation, with future work aiming at multi-domain integration and enhanced consistency constraints.

Abstract

3D-aware image generation necessitates extensive training data to ensure stable training and mitigate the risk of overfitting. This paper first considers a novel task known as One-shot 3D Generative Domain Adaptation (GDA), aimed at transferring a pre-trained 3D generator from one domain to a new one, relying solely on a single reference image. One-shot 3D GDA is characterized by the pursuit of specific attributes, namely, high fidelity, large diversity, cross-domain consistency, and multi-view consistency. Within this paper, we introduce 3D-Adapter, the first one-shot 3D GDA method, for diverse and faithful generation. Our approach begins by judiciously selecting a restricted weight set for fine-tuning, and subsequently leverages four advanced loss functions to facilitate adaptation. An efficient progressive fine-tuning strategy is also implemented to enhance the adaptation process. The synergy of these three technological components empowers 3D-Adapter to achieve remarkable performance, substantiated both quantitatively and qualitatively, across all desired properties of 3D GDA. Furthermore, 3D-Adapter seamlessly extends its capabilities to zero-shot scenarios, and preserves the potential for crucial tasks such as interpolation, reconstruction, and editing within the latent space of the pre-trained generator. Code will be available at https://github.com/iceli1007/3D-Adapter.
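The "restricted weight set" idea in the abstract can be illustrated with a small, framework-free sketch. It is only an illustration under stated assumptions: the parameter names and the prefixes `decoder.` and `superresolution.` are hypothetical stand-ins for the Tri-D and G2 modules, not names confirmed by the paper or by the EG3D code base.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical name prefixes standing in for the Tri-Plane Decoder (Tri-D)
# and the super-resolution module (G2); real module names may differ.
TRAINABLE_PREFIXES: Tuple[str, ...] = ("decoder.", "superresolution.")


def select_trainable(
    param_names: Iterable[str],
    prefixes: Tuple[str, ...] = TRAINABLE_PREFIXES,
) -> Dict[str, bool]:
    """Map each parameter name to True if it should be fine-tuned."""
    return {name: name.startswith(prefixes) for name in param_names}


# Illustrative parameter names (assumed, not taken from the paper).
names = [
    "backbone.synthesis.b4.conv1.weight",
    "backbone.mapping.fc0.weight",
    "decoder.net.0.weight",
    "superresolution.block0.conv0.weight",
]

flags = select_trainable(names)
trainable = [name for name, keep in flags.items() if keep]
frozen = [name for name, keep in flags.items() if not keep]
```

In a deep-learning framework, flags like these would typically be used to disable gradients for the frozen parameters, so that only the selected modules are updated during adaptation.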

Paper Structure

This paper contains 26 sections, 16 equations, 15 figures, 5 tables.

Figures (15)

  • Figure 1: Determining the trainable parameters. An ablation study on fine-tuning different components of EG3D.
  • Figure 2: The overall generator architecture of EG3D (Top) and our 3D-Adapter (Bottom). The EG3D consists of two parts: the Synthesis Network, represented by a yellow box, and the Super-Resolution module, represented by a blue box. These correspond to the yellow component $G$ and the blue component $S$ in our 3D-Adapter. The proposed 3D-Adapter is designed to transfer the knowledge from EG3D's generator ($G_A$ and $S_A$), pre-trained on the source dataset, to the target domain ($G_B$ and $S_B$).
  • Figure 3: Training strategy determination. Ablation study on different fine-tuning strategies.
  • Figure 4: Qualitative comparisons in the one-shot setting between our proposed method, DiFa [zhang2022towards], and DoRM [wu2023domain]. The first row shows images from the source domains and the first column shows reference images from the target domains. Results best seen at 500% zoom.
  • Figure 5: Ablation study of different training losses. Red boxes indicate the differences between adapted target images and the corresponding source images. Results best seen at 500% zoom.
  • ...and 10 more figures