LayGA: Layered Gaussian Avatars for Animatable Clothing Transfer

Siyou Lin, Zhe Li, Zhaoqi Su, Zerong Zheng, Hongwen Zhang, Yebin Liu

TL;DR

LayGA (Layered Gaussian Avatars) is a 3D Gaussian Splatting-based approach for animatable clothing transfer that represents body and clothing as two separate layers. Training has two stages: single-layer reconstruction, which applies geometric constraints and segments clothing from the body, followed by multi-layer fitting, which trains separate body and clothing Gaussians with dedicated geometry and rendering supervision. Together these stages yield smooth surfaces and accurate garment tracking. Key contributions include geometric constraints for surface reconstruction, a clothing segmentation mechanism, separate geometry and rendering layers that preserve rendering fidelity, collision handling between body and garments, and test-time clothing transfer for photorealistic virtual try-on across identities. The authors report improved geometric reconstruction, realistic clothing dynamics, and robust clothing transfer compared with baseline methods, with applications to avatars and AR/VR.

Abstract

Animatable clothing transfer, aiming at dressing and animating garments across characters, is a challenging problem. Most human avatar works entangle the representations of the human body and clothing together, which leads to difficulties for virtual try-on across identities. What's worse, the entangled representations usually fail to exactly track the sliding motion of garments. To overcome these limitations, we present Layered Gaussian Avatars (LayGA), a new representation that formulates body and clothing as two separate layers for photorealistic animatable clothing transfer from multi-view videos. Our representation is built upon the Gaussian map-based avatar for its excellent representation power of garment details. However, the Gaussian map produces unstructured 3D Gaussians distributed around the actual surface. The absence of a smooth explicit surface raises challenges in accurate garment tracking and collision handling between body and garments. Therefore, we propose two-stage training involving single-layer reconstruction and multi-layer fitting. In the single-layer reconstruction stage, we propose a series of geometric constraints to reconstruct smooth surfaces and simultaneously obtain the segmentation between body and clothing. Next, in the multi-layer fitting stage, we train two separate models to represent body and clothing and utilize the reconstructed clothing geometries as 3D supervision for more accurate garment tracking. Furthermore, we propose geometry and rendering layers for both high-quality geometric reconstruction and high-fidelity rendering. Overall, the proposed LayGA realizes photorealistic animations and virtual try-on, and outperforms other baseline methods. Our project page is https://jsnln.github.io/layga/index.html.
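
The abstract names collision handling between body and garments as one reason a smooth explicit surface is needed. As a rough illustration only, and not the authors' pipeline, the sketch below pushes clothing points that fall inside or too close to the body outward along the nearest body normal. The function name `resolve_collisions`, the brute-force nearest-neighbour search, and the margin `eps` are all assumptions made for this example.

```python
import numpy as np

def resolve_collisions(cloth_pts, body_pts, body_normals, eps=0.005):
    """Push clothing points outward until they sit at least `eps` above the body.

    cloth_pts:    (N, 3) clothing-layer points (hypothetical input)
    body_pts:     (M, 3) body-layer points (hypothetical input)
    body_normals: (M, 3) unit outward normals at the body points
    eps:          minimum allowed clearance, an assumed value
    """
    # Brute-force nearest body point for every clothing point.
    diff = cloth_pts[:, None, :] - body_pts[None, :, :]
    nearest = np.argmin(np.sum(diff ** 2, axis=-1), axis=1)

    # Signed distance of each clothing point along the nearest body normal.
    normals = body_normals[nearest]
    signed = np.sum((cloth_pts - body_pts[nearest]) * normals, axis=-1)

    # Only points with clearance below eps are moved, and only along the normal.
    correction = np.clip(eps - signed, 0.0, None)
    return cloth_pts + correction[:, None] * normals
```

A practical version would likely replace the dense distance matrix with a spatial index, since the brute-force search needs memory proportional to N times M.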

Paper Structure (22 sections, 18 equations, 9 figures, 1 table)

Figures (9)

  • Figure 1: Overview of our pipeline. Our pipeline consists of two training stages: 1) single-layer reconstruction and segmentation; 2) multi-layer fitting.
  • Figure 2: Illustration of the clothing-aware avatar representation.
  • Figure 3: Illustration of normal computation on the Gaussian map.
  • Figure 4: Illustration of geometric and rendering layers. $\epsilon$ is the threshold for handling collisions.
  • Figure 5: Our method enables animatable clothing transfer, and each row illustrates animation results with the same upper garment but different identities.
  • ...and 4 more figures
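
Figure 3 refers to normal computation on the Gaussian map. The paper's exact formulation is not reproduced on this page, so the following is only a generic sketch of one common way to do it: assuming the map stores one 3D position per pixel, take finite differences along the two image axes and normalize their cross product. The function name `normals_from_position_map` and the `(H, W, 3)` layout are assumptions for illustration.

```python
import numpy as np

def normals_from_position_map(pos_map):
    """Estimate per-pixel unit normals from an (H, W, 3) map of 3D positions."""
    # Finite differences along rows and columns
    # (central in the interior, one-sided at the borders).
    d_row = np.gradient(pos_map, axis=0)
    d_col = np.gradient(pos_map, axis=1)

    # The cross product of the two tangent directions is perpendicular to the surface.
    n = np.cross(d_col, d_row)

    # Normalize, guarding against degenerate (zero-length) normals.
    length = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.clip(length, 1e-12, None)
```

The sign of the result depends on how the map's axes are oriented relative to the surface, so a real implementation would need to fix the orientation convention, for example by flipping normals that point into the body.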