2DGS-Avatar: Animatable High-fidelity Clothed Avatar via 2D Gaussian Splatting
Qipeng Yan, Mingyang Sun, Lihua Zhang
TL;DR
This work introduces 2DGS-Avatar, a method that reconstructs real-time, high-fidelity clothed avatars from monocular RGB videos using 2D Gaussian Splatting (2DGS). It initializes 2D Gaussian primitives on the SMPL-X canonical surface, transforms them to pose space with forward skinning, and renders them with a differentiable 2DGS rasterizer. Training is supervised by RGB and normal maps and is complemented by self-supervised area regularization and eccentricity filtering, which improve the surface distribution of the primitives and the reconstructed geometry. The approach combines fast training and rendering with detailed clothing geometry: it achieves quantitative results competitive with 3DGS-based methods while significantly reducing training time and memory usage, and it renders at ~60 FPS on consumer GPUs. Experiments on AvatarRex and THuman4.0 show strong qualitative and quantitative performance, and ablations confirm the effectiveness of each proposed component. The work has practical relevance for AR/VR and dynamic character capture; its noted limitations are motion-induced wrinkles and underrepresented regions, which motivate future improvements in garment modeling.
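To make the forward-skinning step concrete, the following is a minimal sketch that assumes standard linear blend skinning (LBS), the scheme used by SMPL-family body models. The summary does not specify the paper's exact formulation, so the function name `forward_skinning`, the array layouts, and the choice to return the blended linear part (which could be used to reorient each Gaussian's tangent axes) are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def forward_skinning(points, weights, transforms):
    """Map canonical points to pose space with linear blend skinning.

    Assumed (hypothetical) layouts:
      points:     (N, 3) canonical positions, e.g. Gaussian centers
      weights:    (N, J) skinning weights, each row summing to 1
      transforms: (J, 4, 4) per-joint transforms, canonical -> posed
    """
    # Blend the per-joint transforms for every point: (N, 4, 4)
    blended = np.einsum("nj,jab->nab", weights, transforms)
    # Lift points to homogeneous coordinates: (N, 4)
    ones = np.ones((points.shape[0], 1))
    homo = np.concatenate([points, ones], axis=1)
    # Apply each point's blended transform
    posed = np.einsum("nab,nb->na", blended, homo)
    # Return posed positions and the blended 3x3 linear part
    return posed[:, :3], blended[:, :3, :3]
```

For example, with two joints (identity and a unit translation along x), a point at the origin weighted equally between them moves halfway along x, while a point bound only to the identity joint stays in place.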
Abstract
Real-time rendering of high-fidelity, animatable avatars from monocular videos remains a challenging problem in computer vision and graphics. Over the past few years, methods based on the Neural Radiance Field (NeRF) have made significant progress in rendering quality, but their run-time performance is poor because volumetric rendering is inefficient. Recently, methods based on 3D Gaussian Splatting (3DGS) have shown great potential for fast training and real-time rendering. However, they still suffer from artifacts caused by inaccurate geometry. To address these problems, we propose 2DGS-Avatar, a novel approach based on 2D Gaussian Splatting (2DGS) for modeling animatable clothed avatars with high fidelity and fast training. Given monocular RGB videos as input, our method generates an avatar that can be driven by poses and rendered in real time. Compared to 3DGS-based methods, 2DGS-Avatar retains the advantages of fast training and rendering while also capturing detailed, dynamic, and photo-realistic appearances. We conduct extensive experiments on popular datasets such as AvatarRex and THuman4.0, demonstrating strong performance in both qualitative and quantitative evaluations.
