Unsupervised Cross-Domain Regression for Fine-grained 3D Game Character Reconstruction

Qi Wen, Xiang Wen, Hao Jiang, Siqi Yang, Bingfeng Han, Tianlei Hu, Gang Chen, Shuang Li

TL;DR

The paper tackles cross-domain reconstruction of fine-grained 3D game characters from single-view images by introducing an unsupervised, end-to-end framework that bridges real-world and game domains. It employs a shared encoder/decoder/regressor and a differentiable game-engine imitator, augmented with a domain loss based on maximum mean discrepancy ($MMD$) and a contrastive loss in parameter space, plus identity-oriented auxiliary extractors. A joint objective $\mathcal{L} = \lambda_1 L_{restored} + \lambda_2 L_{domain} + \lambda_3 L_{contrastive} + \lambda_4 L_{consistency}$ guides training, yielding state-of-the-art quantitative results and robust qualitative performance. The method demonstrates strong practical impact by enabling scalable, photorealistic 3D avatar reconstruction in real games (e.g., code-named J1) for metaverse-like experiences.
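As a rough illustration of the two quantities named above, the following sketch computes a biased estimate of the squared MMD between a batch of source-domain and a batch of target-domain parameter vectors, and combines named loss terms into a weighted sum of the form $\mathcal{L} = \sum_i \lambda_i L_i$. The Gaussian kernel, the bandwidth `sigma`, the function names, and the example weights are all assumptions made for illustration; this summary does not state the kernel or the $\lambda$ values the authors use.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """RBF kernel between two equal-length vectors (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(source, target, sigma=1.0):
    """Biased (V-statistic) estimate of squared MMD between two batches of vectors."""

    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )


def joint_objective(losses, weights):
    """Weighted sum of named loss terms: sum_i lambda_i * L_i."""
    return sum(weights[name] * value for name, value in losses.items())


if __name__ == "__main__":
    # Hypothetical 2-D "parameter" batches standing in for the two domains.
    real_params = [[0.0, 0.0], [0.1, 0.0]]
    game_params = [[3.0, 3.0], [3.1, 3.0]]
    domain_loss = mmd_squared(real_params, game_params)

    # Placeholder values and weights, not taken from the paper.
    losses = {"restored": 0.4, "domain": domain_loss,
              "contrastive": 0.7, "consistency": 0.2}
    weights = {"restored": 1.0, "domain": 1.0,
               "contrastive": 1.0, "consistency": 1.0}
    print(domain_loss, joint_objective(losses, weights))
```

The estimate is zero when both batches are identical and grows as the two distributions move apart, which is why minimizing it pulls the regressed parameters of the two domains together.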

Abstract

With the rise of the ``metaverse'' and the rapid development of games, faithfully reconstructing characters in the virtual world has become increasingly important. Immersion is one of the central themes of the ``metaverse'', and how faithfully an avatar reproduces its user is crucial to it. Games, in turn, serve as carriers of the metaverse, letting players freely edit the facial appearance of their characters. In this paper, we propose a simple but powerful cross-domain framework that reconstructs fine-grained 3D game characters from single-view images in an end-to-end manner. Unlike previous methods, which do not address the cross-domain gap, we propose an effective regressor that greatly reduces the discrepancy between the real-world domain and the game domain. To cope with the lack of ground truth, our unsupervised framework transfers knowledge to the target domain. Additionally, a contrastive loss is proposed to address the instance-wise disparity, preserving the person-specific details of the reconstructed character. An auxiliary 3D identity-aware extractor is also used to further refine the results. The model then robustly generates a large set of physically meaningful facial parameters. Experiments demonstrate that our method yields state-of-the-art performance in 3D game character reconstruction.
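The abstract does not give the form of the contrastive loss, so the sketch below substitutes a generic InfoNCE-style loss to illustrate the instance-wise idea: each vector in one batch should be closer to its own counterpart in the other batch than to anyone else's. The cosine similarity, the `temperature` value, and the function names are assumptions for illustration, not the authors' formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss; positives[i] matches anchors[i], the rest act as negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[i]
    return total / len(anchors)


if __name__ == "__main__":
    # Hypothetical parameter vectors for two identities.
    real = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]   # same identities, same order
    swapped = [[0.0, 1.0], [1.0, 0.0]]   # identities mismatched
    print(info_nce(real, matched), info_nce(real, swapped))
```

The loss is small when each instance is paired with its own counterpart and large when the pairs are mismatched, which is the property that lets a loss of this kind preserve person-specific details.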

Paper Structure

This paper contains 15 sections, 12 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Reconstructing characters in games is a promising but challenging task. We propose a framework that resolves domain-wise and instance-wise disparity to reconstruct fine-grained game characters.
  • Figure 2: Network architecture. Our model consists of four parts: an encoder $E$ and a decoder $D$, a regressor $R$, an imitator $G$, and auxiliary extractors $F_{3d}$ and $F_{id}$. Four loss functions are applied: restored loss (Eq. \ref{eq:restored}), domain loss (Eq. \ref{eq:domain}), contrastive loss (Eq. \ref{eq:contrastive}), and consistency loss (Eq. \ref{eq:consistency}).
  • Figure 3: Comparison of the results. We compare our method against four classic methods. Our method preserves similarity and identity in both global and local details.
  • Figure 4: Visualization of the t-SNE embeddings of source and target parameters at different learning iterations. These results show that the domain gap is greatly reduced.
  • Figure 5: Robustness test: (a) faces wearing sunglasses, (b) wide-angle profile faces, (c) different ages, (d) mismatched gender. The results indicate that our model remains robust and generalizes under these challenging conditions.
  • ...and 5 more figures