DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis
Yuming Gu, Phong Tran, Yujian Zheng, Hongyi Xu, Heyuan Li, Adilbek Karmanov, Hao Li
TL;DR
DiffPortrait360 synthesizes view-consistent 360-degree head views from a single portrait and supports 3D-aware NeRF rendering for diverse subjects. It extends DiffPortrait3D with a custom ControlNet that generates back-of-head detail, a dual appearance module for global front-back consistency, a back reference image, and a view-consistency training regime based on continuous view sequences, all built on a frozen latent diffusion backbone. The approach achieves robust, locally continuous view synthesis across human, stylized, and anthropomorphic heads, outperforming state-of-the-art methods on both stylized and real portraits. Because it produces high-quality 3D-aware assets from single images, it supports immersive telepresence and scalable personalized content creation.
Abstract
Generating high-quality 360-degree views of human heads from single-view images is essential for enabling accessible immersive telepresence applications and scalable personalized content creation. While cutting-edge methods for full head generation are limited to modeling realistic human heads, the latest diffusion-based approaches for style-omniscient head synthesis can produce only frontal views and struggle with view consistency, preventing their conversion into true 3D models for rendering from arbitrary angles. We introduce a novel approach that generates fully consistent 360-degree head views, accommodating human, stylized, and anthropomorphic forms, including accessories like glasses and hats. Our method builds on the DiffPortrait3D framework, incorporating a custom ControlNet for back-of-head detail generation and a dual appearance module to ensure global front-back consistency. By training on continuous view sequences and integrating a back reference image, our approach achieves robust, locally continuous view synthesis. Our model can be used to produce high-quality neural radiance fields (NeRFs) for real-time, free-viewpoint rendering, outperforming state-of-the-art methods in object synthesis and 360-degree head generation for very challenging input portraits.
