Unsupervised Cross-Domain Regression for Fine-grained 3D Game Character Reconstruction
Qi Wen, Xiang Wen, Hao Jiang, Siqi Yang, Bingfeng Han, Tianlei Hu, Gang Chen, Shuang Li
TL;DR
The paper tackles cross-domain reconstruction of fine-grained 3D game characters from single-view images by introducing an unsupervised, end-to-end framework that bridges the real-world and game domains. It employs a shared encoder, decoder, and regressor together with a differentiable game-engine imitator, augmented with a domain loss based on maximum mean discrepancy (MMD), a contrastive loss in parameter space, and identity-oriented auxiliary extractors. A joint objective $\mathcal{L} = \lambda_1 \mathcal{L}_{\text{restored}} + \lambda_2 \mathcal{L}_{\text{domain}} + \lambda_3 \mathcal{L}_{\text{contrastive}} + \lambda_4 \mathcal{L}_{\text{consistency}}$ guides training, yielding state-of-the-art quantitative results and robust qualitative performance. The method demonstrates practical impact by enabling scalable, photorealistic 3D avatar reconstruction in real games (e.g., one code-named J1) for metaverse-like experiences.
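To make the MMD-based domain loss and the weighted joint objective more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a Gaussian (RBF) kernel, the biased squared-MMD estimator, and plain Python lists as stand-ins for encoder features; the function names, the bandwidth `sigma`, the toy feature values, and the loss values and weights are all hypothetical and not taken from the paper.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """RBF kernel between two equal-length feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors."""
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


def joint_objective(losses, weights):
    """Weighted sum of named loss terms, mirroring the lambda-weighted objective."""
    return sum(weights[name] * value for name, value in losses.items())


# Toy features standing in for real-world and game-domain encodings (hypothetical).
real_feats = [[0.0, 0.1], [0.2, -0.1], [0.1, 0.0]]
game_feats = [[2.0, 2.1], [1.9, 2.0], [2.1, 1.8]]

domain_loss = mmd_squared(real_feats, game_feats)

# Placeholder values for the other terms; the weights are illustrative only.
losses = {"restored": 0.8, "domain": domain_loss, "contrastive": 0.3, "consistency": 0.1}
weights = {"restored": 1.0, "domain": 0.5, "contrastive": 0.5, "consistency": 0.1}
total_loss = joint_objective(losses, weights)
```

In this sketch the domain term shrinks toward zero as the two feature sets become indistinguishable under the kernel, which is the behavior a domain-alignment loss relies on. In an actual training setup the same computation would be written with a differentiable tensor library so that gradients reach the shared encoder.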
Abstract
With the rise of the "metaverse" and the rapid development of games, faithfully reconstructing characters in the virtual world has become increasingly important. Immersion is one of the central themes of the metaverse, and it depends critically on how faithfully an avatar reproduces the real person. Games serve as the carrier of the metaverse, and in them players can freely edit the facial appearance of their characters. In this paper, we propose a simple but powerful cross-domain framework that reconstructs fine-grained 3D game characters from single-view images in an end-to-end manner. Unlike previous methods, which do not address the cross-domain gap, we propose an effective regressor that greatly reduces the discrepancy between the real-world domain and the game domain. To cope with the absence of ground truth, our unsupervised framework achieves knowledge transfer to the target domain. Additionally, a novel contrastive loss is proposed to handle instance-wise disparity, preserving the person-specific details of the reconstructed character. An auxiliary 3D identity-aware extractor is further employed to refine the results. Together, these components robustly generate a large set of physically meaningful facial parameters. Experiments demonstrate that our method yields state-of-the-art performance in 3D game character reconstruction.
