FreeCloth: Free-form Generation Enhances Challenging Clothed Human Modeling

Hang Ye, Xiaoxuan Ma, Hai Ci, Wentao Zhu, Yizhou Wang

TL;DR

FreeCloth introduces a hybrid clothed-human modeling framework that splits the surface into unclothed, deformed, and generated regions, applying LBS-based deformation near the body and a free-form generator for distant loose garments. The method uses a garment-specific clothing-cut map and structure-aware pose encoding to condition the free-form generator, achieving state-of-the-art results on loose clothing datasets with improved visual fidelity and realism. Extensive ablations demonstrate the necessity of the clothing-cut map, the hybrid paradigm, and the part-aware conditioning for high-quality skirts and dresses, while maintaining efficiency. This approach enables more expressive and flexible avatar modeling, with potential extensions to multi-subject clothing, non-skirt garments, and temporal consistency.

Abstract

Achieving realistic animated human avatars requires accurate modeling of pose-dependent clothing deformations. Existing learning-based methods heavily rely on the Linear Blend Skinning (LBS) of minimally-clothed human models like SMPL to model deformation. However, they struggle to handle loose clothing, such as long dresses, where the canonicalization process becomes ill-defined when the clothing is far from the body, leading to disjointed and fragmented results. To overcome this limitation, we propose FreeCloth, a novel hybrid framework to model challenging clothed humans. Our core idea is to use dedicated strategies to model different regions, depending on whether they are close to or distant from the body. Specifically, we segment the human body into three categories: unclothed, deformed, and generated. We simply replicate unclothed regions that require no deformation. For deformed regions close to the body, we leverage LBS to handle the deformation. As for the generated regions, which correspond to loose clothing areas, we introduce a novel free-form, part-aware generator to model them, as they are less affected by movements. This free-form generation paradigm brings enhanced flexibility and expressiveness to our hybrid framework, enabling it to capture the intricate geometric details of challenging loose clothing, such as skirts and dresses. Experimental results on the benchmark dataset featuring loose clothing demonstrate that FreeCloth achieves state-of-the-art performance with superior visual fidelity and realism, particularly in the most challenging cases.
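The three-way split described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation: the function names (`lbs`, `model_clothed_human`), the string labels, the per-point `offsets`, and the `generator` callable are all hypothetical stand-ins. It also assumes body points are given in a canonical pose, so that "replicating" an unclothed point means skinning it with no clothing offset. In the paper the offsets and the generated points come from learned networks conditioned on pose and garment type; here they are simply passed in.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
# A rigid bone transform: 3x3 rotation (row-major) plus a translation.
Transform = Tuple[Sequence[Sequence[float]], Sequence[float]]


def apply_transform(t: Transform, p: Point) -> Point:
    """Apply one bone's rotation and translation to a 3D point."""
    rot, trans = t
    return tuple(
        sum(rot[i][j] * p[j] for j in range(3)) + trans[i] for i in range(3)
    )


def lbs(p: Point, weights: Sequence[float], bones: Sequence[Transform]) -> Point:
    """Linear Blend Skinning: weighted sum of the per-bone transformed point."""
    out = [0.0, 0.0, 0.0]
    for w, bone in zip(weights, bones):
        q = apply_transform(bone, p)
        for i in range(3):
            out[i] += w * q[i]
    return tuple(out)


def model_clothed_human(
    points: Sequence[Point],
    labels: Sequence[str],               # "unclothed" | "deformed" | "generated"
    weights: Sequence[Sequence[float]],  # per-point skinning weights
    bones: Sequence[Transform],
    offsets: Sequence[Point],            # hypothetical predicted clothing offsets
    generator: Callable[[Sequence[Transform]], List[Point]],
) -> List[Point]:
    """Merge unclothed, deformed, and generated points into one point cloud."""
    unclothed: List[Point] = []
    deformed: List[Point] = []
    for p, label, w, d in zip(points, labels, weights, offsets):
        if label == "unclothed":
            # No clothing deformation: the posed body point is copied as-is.
            unclothed.append(lbs(p, w, bones))
        elif label == "deformed":
            # Near-body clothing: add the offset, then skin with LBS.
            shifted = tuple(a + b for a, b in zip(p, d))
            deformed.append(lbs(shifted, w, bones))
        # Body points labeled "generated" are skipped: loose clothing there
        # is produced by the free-form generator rather than by LBS.
    generated = generator(bones)
    return unclothed + deformed + generated
```

The design point this illustrates is that only near-body points pass through LBS, while the loose region bypasses canonicalization entirely and is produced directly in posed space.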

Paper Structure

This paper contains 31 sections, 13 equations, 26 figures, 6 tables.

Figures (26)

  • Figure 1: (a) An overview of our framework for modeling clothed humans. Based on the specific modeling needs of different regions, we employ a dedicated strategy to handle various clothing areas. Specifically, for loose regions (green) that are less affected by body movements and require more freedom, we propose free-form generation to enhance flexibility. For near-body clothing areas (blue), we apply LBS-based deformation, while unclothed regions (yellow) that do not require deformation can be directly replicated. (b) Visual comparison between prior methods (POP ma2021pop, FITE lin2022fite) and our method on challenging clothing. Our method captures more high-fidelity details and achieves superior visual quality and realism. Code is available at https://alvinyh.github.io/FreeCloth.
  • Figure 2: Overview of our hybrid framework FreeCloth. Given an unclothed, posed body and a specific garment type, our goal is to create a realistic clothed human. We first segment the human parts into three regions (\ref{sec:cut}): unclothed parts (yellow), which need no deformation; deformed parts (blue); and generated parts (green). The hybrid framework comprises two essential modules: (1) an LBS-based local deformation network (\ref{sec:deform}) that produces the pose-dependent deformed points $\boldsymbol{X}^d$ close to the human body, and (2) a free-form generator (\ref{sec:hybrid}) that generates the looser clothing regions $\boldsymbol{X}^g$. By merging the unclothed, deformed, and generated points, we obtain the complete point cloud of a clothed human $\boldsymbol{X}$.
  • Figure 3: Perceptual study results. Across all examples, $63.4\%$ of human users prefer our method over the baselines. Additionally, our model receives $56.0\%$ of the votes from the GPT-4o model achiam2023gpt4. These results highlight the significant superiority of our approach, particularly in handling the most challenging clothing.
  • Figure 4: Qualitative comparison between baselines and our method for modeling loose clothing. Subject IDs from top to bottom: "felice-004", "janett-025" and "christine-027". Best viewed zoomed-in on a color screen.
  • Figure 5: Visualization results of loose clothing. Our model effectively avoids redundant points on the open surface of loose skirts, a limitation in FITE lin2022fite, and generates more accurate geometry than POP ma2021pop and SkiRT ma2022skirt.
  • ...and 21 more figures