Snapmoji: Instant Generation of Animatable Dual-Stylized Avatars
Eric M. Chen, Di Liu, Sizhuo Ma, Michael Vasilkovsky, Bing Zhou, Qiang Gao, Wenzhou Wang, Jiahao Luo, Dimitris N. Metaxas, Vincent Sitzmann, Jian Wang
TL;DR
This work addresses the demand for expressive, animatable avatars by introducing Snapmoji, a two-stage system that first converts a selfie into a primary-style 2D avatar via Gaussian Domain Adaptation (GDA) and diffusion, then lifts this result into a 3D Gaussian avatar capable of dynamic animation. A secondary style, such as Plastic Toy or Alien, can be applied on top of the primary style. The core contribution is the GDA framework, which leverages 3D priors learned from data such as Objaverse to produce high-fidelity primary-style avatars while preserving identity. A second contribution is a 3D animation pipeline that combines 3DMM and FACS blendshapes within a cross-attention generator to deliver real-time, expressive avatars on mobile devices, with a WebGL rendering backend. The system demonstrates superior 2D stylization quality, robust 3D geometry, fast generation (0.9 s per selfie), and real-time rendering (30–40 FPS on mobile), offering a practical tool for instant, dual-stylized avatar creation in AR and social applications.
Abstract
The increasing popularity of personalized avatar systems, such as Snapchat Bitmojis and Apple Memojis, highlights the growing demand for digital self-representation. Despite their widespread use, existing avatar platforms face significant limitations, including restricted expressivity due to predefined assets, tedious customization processes, or inefficient rendering requirements. Addressing these shortcomings, we introduce Snapmoji, an avatar generation system that instantly creates animatable, dual-stylized avatars from a selfie. We propose Gaussian Domain Adaptation (GDA), which is pre-trained on large-scale Gaussian models using 3D data from sources such as Objaverse and fine-tuned with 2D style transfer tasks, endowing it with a rich 3D prior. This enables Snapmoji to transform a selfie into a primary stylized avatar, like the Bitmoji style, and apply a secondary style, such as Plastic Toy or Alien, all while preserving the user's identity and the primary style's integrity. Our system is capable of producing 3D Gaussian avatars that support dynamic animation, including accurate facial expression transfer. Designed for efficiency, Snapmoji achieves selfie-to-avatar conversion in just 0.9 seconds and supports real-time interactions on mobile devices at 30 to 40 frames per second. Extensive testing confirms that Snapmoji outperforms existing methods in versatility and speed, making it a convenient tool for automatic avatar creation in various styles.
