OMEGA-Avatar: One-shot Modeling of 360° Gaussian Avatars
Zehao Xia, Yiqun Wang, Zhengda Lu, Kai Liu, Jun Xiao, Peter Wonka
TL;DR
OMEGA-Avatar produces animatable, full-head 3D avatars from a single image by combining diffusion-guided, semantic-aware FLAME mesh deformation with a dual Gaussian head decoded from a canonical UV representation. It introduces a semantic-aware, topology-preserving deformation to capture hair and unseen regions, and a multi-view feature splatting module that fuses features from generated views into a shared UV map for stable, view-consistent Gaussian decoding. The method achieves state-of-the-art 360° full-head completeness and identity preservation on NeRSemble and Avatar-256, while remaining fully feed-forward with no per-subject optimization. The result is efficient, one-shot creation of high-fidelity, animatable avatars suitable for real-time rendering.
Abstract
Creating high-fidelity, animatable 3D avatars from a single image remains a formidable challenge. We identify three desirable attributes of an avatar generation method: it should 1) be feed-forward, 2) model a full 360° head, and 3) be animation-ready. However, existing work addresses only two of these three attributes at once. To address this limitation, we propose OMEGA-Avatar, the first feed-forward framework that generates a generalizable, 360°-complete, and animatable 3D Gaussian head from a single image. Starting from a feed-forward and animatable framework, we address the 360° full-head avatar generation problem with two novel components. First, to overcome poor hair modeling in full-head avatar generation, we introduce a semantic-aware mesh deformation module that integrates multi-view normals to optimize a FLAME head with hair while preserving its topology. Second, to enable effective feed-forward decoding of full-head features, we propose a multi-view feature splatting module that constructs a shared canonical UV representation from features across multiple views through differentiable bilinear splatting, hierarchical UV mapping, and visibility-aware fusion. This approach preserves both global structural coherence and local high-frequency details across all viewpoints, ensuring 360° consistency without per-instance optimization. Extensive experiments demonstrate that OMEGA-Avatar achieves state-of-the-art performance, significantly outperforming existing baselines in 360° full-head completeness while robustly preserving identity across different viewpoints.
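To make the idea of bilinear splatting with visibility-aware fusion concrete, the following is a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes each view supplies per-point UV coordinates in [0, 1], a feature vector per point, and a non-negative visibility weight per point, and it fuses views by a visibility-weighted average per texel. The function names `splat_to_uv` and `fuse_views`, the weighting scheme, and the zero fill for unobserved texels are all assumptions made for illustration; the hierarchical UV mapping mentioned in the abstract is omitted.

```python
import numpy as np

def splat_to_uv(uv, feats, vis, res):
    """Bilinearly splat N point features into a (res, res, C) UV grid.

    uv:    (N, 2) UV coordinates in [0, 1] (assumed layout: u = x, v = y)
    feats: (N, C) per-point features
    vis:   (N,)   non-negative visibility weights (hypothetical input)
    Returns the weighted feature sum (res, res, C) and weight sum (res, res).
    Requires res >= 2.
    """
    acc = np.zeros((res, res, feats.shape[1]))
    wsum = np.zeros((res, res))
    xy = np.clip(uv, 0.0, 1.0) * (res - 1)
    # Top-left texel of the 2x2 footprint, clamped so x0 + 1 stays in range.
    x0 = np.minimum(np.floor(xy[:, 0]).astype(int), res - 2)
    y0 = np.minimum(np.floor(xy[:, 1]).astype(int), res - 2)
    fx = xy[:, 0] - x0
    fy = xy[:, 1] - y0
    corners = (
        (0, 0, (1 - fx) * (1 - fy)),
        (1, 0, fx * (1 - fy)),
        (0, 1, (1 - fx) * fy),
        (1, 1, fx * fy),
    )
    for dx, dy, w in corners:
        w = w * vis  # scale the bilinear weight by visibility
        # np.add.at accumulates correctly when several points hit one texel.
        np.add.at(acc, (y0 + dy, x0 + dx), w[:, None] * feats)
        np.add.at(wsum, (y0 + dy, x0 + dx), w)
    return acc, wsum

def fuse_views(views, res, eps=1e-8):
    """Fuse several views into one UV map by a visibility-weighted average.

    views: iterable of (uv, feats, vis) tuples, one per view.
    Texels that receive no weight are left at zero.
    """
    acc_total, w_total = None, None
    for uv, feats, vis in views:
        acc, w = splat_to_uv(uv, feats, vis, res)
        acc_total = acc if acc_total is None else acc_total + acc
        w_total = w if w_total is None else w_total + w
    denom = np.maximum(w_total, eps)[..., None]
    fused = np.where(w_total[..., None] > eps, acc_total / denom, 0.0)
    return fused, w_total

# Two views observe the same UV location with different confidence.
view_a = (np.array([[0.0, 0.0]]), np.array([[2.0]]), np.array([1.0]))
view_b = (np.array([[0.0, 0.0]]), np.array([[4.0]]), np.array([3.0]))
fused, weights = fuse_views([view_a, view_b], res=4)
```

In this toy case the texel at the UV origin receives the average of the two features weighted by visibility, so the more confidently observed view dominates. NumPy only shows the forward pass; in an autodiff framework the same scatter-add is differentiable with respect to the features, and through the bilinear weights with respect to the UV coordinates.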
