Two-stage Synthetic Supervising and Multi-view Consistency Self-supervising based Animal 3D Reconstruction by Single Image
Zijian Kuang, Lihang Ying, Shi Jin, Li Cheng
TL;DR
This work tackles the challenge of 3D animal reconstruction from single images by introducing a two-stage framework that first trains on synthetic bird models using a pixel-aligned implicit function with differentiable rendering, then fine-tunes on real imagery with 2D self-supervision and silhouettes. A 2D multi-view consistency loss enforces view-invariant reconstructions, while a transfer-learning strategy adapts the model to real data. The approach yields superior results for bird 3D digitization and generalizes to other animals such as horses, cows, bears, and dogs, outperforming state-of-the-art supervised methods in both shape and texture reconstruction. The methodology leverages synthetic data to bootstrap learning and uses real-world silhouettes to bridge domain gaps, offering a practical pathway for animal 3D digitization from single-view images.
Abstract
Although the Pixel-aligned Implicit Function (PIFu) effectively captures subtle variations in body shape within a low-dimensional space through extensive training on human 3D scans, applying it to live animals is difficult because animals cannot readily be made to cooperate for 3D scanning. To address this challenge, we propose a two-stage training scheme that combines supervised and self-supervised learning. In the first stage, we leverage synthetic animal models for supervised learning, which allows the model to learn from a diverse set of virtual animal instances. In the second stage, we use 2D multi-view consistency as a self-supervised training signal, which further enhances the model's ability to reconstruct accurate and realistic 3D shape and texture from widely available single-view images of real animals. Our results demonstrate that the approach outperforms state-of-the-art methods both quantitatively and qualitatively on bird 3D digitization. The source code is available at https://github.com/kuangzijian/drifu-for-animals.
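For readers unfamiliar with PIFu, the following is a minimal NumPy sketch of the general pixel-aligned query idea: a 3D point is projected onto the image plane, an image feature is sampled at that pixel, and the feature is concatenated with the point's depth and passed to a small MLP that predicts occupancy. It is an illustration only, not the implementation in the linked repository. The function names (`bilinear_sample`, `pifu_query`), the orthographic projection, the two-layer MLP, and the randomly initialised weights in the usage example are all assumptions made for this sketch.

```python
import numpy as np

def bilinear_sample(feat, xy):
    """Bilinearly sample a feature map.

    feat: (H, W, C) feature map; xy: (N, 2) pixel coordinates as (x, y).
    Returns an (N, C) array of interpolated features.
    """
    H, W, _ = feat.shape
    x = np.clip(xy[:, 0], 0, W - 1)
    y = np.clip(xy[:, 1], 0, H - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = feat[y0, x0] * (1 - wx) + feat[y0, x1] * wx
    bottom = feat[y1, x0] * (1 - wx) + feat[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def pifu_query(feat, points, w1, b1, w2, b2):
    """Predict occupancy in (0, 1) for 3D points in [-1, 1]^3.

    Assumes an orthographic camera looking along the z axis (hypothetical
    setup for this sketch) and a two-layer MLP with ReLU activation.
    """
    H, W, _ = feat.shape
    px = (points[:, 0] + 1) * 0.5 * (W - 1)   # x in [-1, 1] -> column index
    py = (1 - points[:, 1]) * 0.5 * (H - 1)   # y up in 3D, rows go down in the image
    local = bilinear_sample(feat, np.stack([px, py], axis=1))
    depth = points[:, 2:3]                    # depth along the viewing direction
    hidden = np.maximum(np.concatenate([local, depth], axis=1) @ w1 + b1, 0)
    logits = hidden @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))

# Usage with random stand-ins for the encoder output and MLP weights.
rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 8, 4))
points = rng.uniform(-1, 1, size=(5, 3))
w1, b1 = rng.normal(size=(5, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 1)), np.zeros(1)
occupancy = pifu_query(feat, points, w1, b1, w2, b2)
```

In a trained system the feature map would come from an image encoder and the weights would be learned; thresholding the predicted occupancy (commonly at 0.5) over a dense grid of points gives the surface to be extracted.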
