NOVA-3D: Non-overlapped Views for 3D Anime Character Reconstruction

Hongsheng Wang, Nanjie Yao, Xinrui Zhou, Shengyu Zhang, Huahao Xu, Fei Wu, Feng Lin

TL;DR

This work tackles the problem of reconstructing full-body 3D anime characters from non-overlapped front and back views, addressing data scarcity and the inapplicability of conventional NeRF-based methods. It introduces NOVA-3D, a GAN-based pipeline that uses a dual-viewpoint encoder and direction-aware attention to synthesize full-body 3D characters via tri-plane representations, trained with the NOVA-Human dataset of 10.2k models and 163.2k images with calibrated camera parameters. A composite loss comprising reconstruction, adversarial, and regularization terms enforces high fidelity and geometric consistency. Experiments show that NOVA-3D outperforms single-view and multi-view baselines in both head and full-body reconstruction, delivering richer details and fewer artifacts, and demonstrates strong generalization on the NOVA-Human data. The NOVA-Human dataset and accompanying code release aim to accelerate research and practical adoption for automated 3D anime character production.

Abstract

In the animation industry, 3D modelers typically rely on non-overlapped front and back concept designs to guide the 3D modeling of anime characters. However, automated approaches for generating anime characters directly from these 2D designs are currently lacking. We therefore explore a novel task: reconstructing anime characters from non-overlapped views. This task presents two main challenges: existing multi-view approaches cannot be applied directly because the views share no overlapping regions, and full-body anime character data and standard benchmarks are scarce. To bridge this gap, we present Non-Overlapped Views for 3D Anime Character Reconstruction (NOVA-3D), a new framework that uses view-aware feature fusion to learn 3D-consistent features effectively and synthesizes full-body anime characters directly from non-overlapped front and back views. To facilitate this line of research, we collected the NOVA-Human dataset, which comprises multi-view images and accurate camera parameters for 3D anime characters. Extensive experiments demonstrate that the proposed method outperforms baseline approaches, achieving superior reconstruction of anime characters with exceptional detail fidelity. To further verify the effectiveness of our method, we also applied it to the anime head reconstruction task and improved on the state-of-the-art baseline, reaching 94.453 in SSIM, 7.726 in LPIPS, and 19.575 in PSNR on average. Code and datasets are available at https://wanghongsheng01.github.io/NOVA-3D/.
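
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how PSNR is conventionally computed between a rendered view and its ground-truth image. It is a generic NumPy illustration, not the evaluation code of NOVA-3D; the function name `psnr` and the assumption that pixel values lie in [0, 1] are ours.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB (assumes pixel values in [0, max_value])."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    # Mean squared error over all pixels and channels.
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Hypothetical example: a uniform error of 0.1 on a [0, 1] image.
ground_truth = np.zeros((64, 64, 3))
rendered = np.full((64, 64, 3), 0.1)
score = psnr(ground_truth, rendered)
```

Higher PSNR means a smaller pixel-wise error. SSIM and LPIPS measure structural and perceptual similarity respectively and need dedicated implementations.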

Paper Structure

This paper contains 35 sections, 7 equations, 17 figures, 1 table.

Figures (17)

  • Figure 1: Left: NOVA-3D achieves full-body anime character reconstruction from non-overlapped views. Right: Results of NOVA-3D on head reconstruction of anime characters, showing fine texture details, clear contours, and 3D consistency.
  • Figure 2: The overall pipeline of NOVA-3D. NOVA-3D utilizes front and rear viewpoint images as input. The dual-viewpoint encoder extracts features from the images, which are then used by the generator to produce two tri-planes. The tri-planes are sampled to obtain sampling features, and the direction-aware attention module is employed to fuse the features. Finally, the reconstruction loss and GAN loss modules are used to calculate the overall loss.
  • Figure 3: Some examples of front and back images, where the front image contains more high-frequency information and the back image contains more low-frequency information.
  • Figure 4: Directly concatenating the features of the front and back views leads to a ghost face in the back view.
  • Figure 5: Top: The reconstruction results of NOVA-3D; Bottom: The reconstruction results of PAniC-3D. NOVA-3D performs better than PAniC-3D in reconstruction on the same character.
  • ...and 12 more figures
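
The pipeline in the Figure 2 caption (two tri-planes, feature sampling, direction-dependent fusion) can be illustrated with a small NumPy sketch. This is an assumption-laden stand-in, not the authors' implementation: the plane layout (XY, XZ, YZ), the summation of plane features, the camera axes (front along +z, back along -z), and the softmax weighting in `fuse_by_direction` are all hypothetical choices, and the paper's direction-aware attention module is presumably learned rather than hand-specified.

```python
import numpy as np

def bilinear_sample(plane, u, v):
    """Bilinearly sample a (C, H, W) feature plane at u, v in [-1, 1]."""
    _, h, w = plane.shape
    x = (np.clip(u, -1.0, 1.0) + 1.0) * 0.5 * (w - 1)
    y = (np.clip(v, -1.0, 1.0) + 1.0) * 0.5 * (h - 1)
    x0 = min(int(np.floor(x)), w - 2)
    y0 = min(int(np.floor(y)), h - 2)
    tx, ty = x - x0, y - y0
    top = (1 - tx) * plane[:, y0, x0] + tx * plane[:, y0, x0 + 1]
    bottom = (1 - tx) * plane[:, y0 + 1, x0] + tx * plane[:, y0 + 1, x0 + 1]
    return (1 - ty) * top + ty * bottom

def sample_triplane(planes, point):
    """Project a 3D point onto the XY, XZ, YZ planes and sum the features.

    planes has shape (3, C, H, W); summation is an assumed aggregation.
    """
    x, y, z = point
    return (bilinear_sample(planes[0], x, y)
            + bilinear_sample(planes[1], x, z)
            + bilinear_sample(planes[2], y, z))

def fuse_by_direction(front_feat, back_feat, view_dir, temperature=0.5):
    """Hand-written stand-in for direction-aware fusion (not the paper's module).

    Weights each tri-plane's feature by a softmax over how closely the
    viewing direction aligns with the assumed front (+z) and back (-z) axes.
    """
    view_dir = np.asarray(view_dir, dtype=np.float64)
    view_dir = view_dir / np.linalg.norm(view_dir)
    front_axis = np.array([0.0, 0.0, 1.0])
    logits = np.array([view_dir @ front_axis, view_dir @ -front_axis]) / temperature
    weights = np.exp(logits - logits.max())
    weights = weights / weights.sum()
    return weights[0] * front_feat + weights[1] * back_feat, weights

# Hypothetical tri-planes: 4 channels at 8x8 resolution.
front_planes = np.ones((3, 4, 8, 8))
back_planes = np.zeros((3, 4, 8, 8))
point = (0.1, -0.3, 0.7)
front_feat = sample_triplane(front_planes, point)
back_feat = sample_triplane(back_planes, point)
side_fused, side_weights = fuse_by_direction(front_feat, back_feat, [1.0, 0.0, 0.0])
front_fused, front_weights = fuse_by_direction(front_feat, back_feat, [0.0, 0.0, 1.0])
```

In this sketch a side-on viewing direction weights both tri-planes equally, while a direction aligned with the front axis draws almost entirely on the front tri-plane. Down-weighting front features when viewing from behind is the kind of behaviour that would avoid the ghost face described in the Figure 4 caption.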