Progressive Limb-Aware Virtual Try-On

Xiaoyu Han; Shengping Zhang; Qinglin Liu; Zonglin Li; Chenyang Wang

TL;DR

PL-VTON addresses the fidelity and limb-texture challenges of image-based virtual try-on by proposing a progressive, limb-aware framework. It introduces three integrated components: MCW for two-stage, multi-attribute clothing warping; HPE to provide a semantically informed parsing map that constrains garment placement and limb textures; and LTF to fuse textures in a coarse-to-fine manner with explicit limb guidance. The method achieves state-of-the-art performance on VITON, improving both qualitative realism and quantitative metrics such as FID, and demonstrates robustness across cross-category clothing changes like long- to short-sleeve transformations. This work advances practical virtual try-on by preserving limb details and enabling more accurate garment geometry, with clear implications for e-commerce and fashion editing.

Abstract

Existing image-based virtual try-on methods directly transfer specific clothing to a human image without utilizing clothing attributes to refine the transferred clothing geometry and textures, which causes incomplete and blurred clothing appearances. In addition, these methods usually mask the limb textures of the input for the clothing-agnostic person representation, which results in inaccurate predictions for human limb regions (i.e., the exposed arm skin), especially when transforming between long-sleeved and short-sleeved garments. To address these problems, we present a progressive virtual try-on framework, named PL-VTON, which performs pixel-level clothing warping based on multiple attributes of clothing and embeds explicit limb-aware features to generate photo-realistic try-on results. Specifically, we design a Multi-attribute Clothing Warping (MCW) module that adopts a two-stage alignment strategy based on multiple attributes to progressively estimate pixel-level clothing displacements. A Human Parsing Estimator (HPE) is then introduced to semantically divide the person into various regions, which provides structural constraints on the human body and therefore alleviates texture bleeding between clothing and limb regions. Finally, we propose a Limb-aware Texture Fusion (LTF) module to estimate high-quality details in limb regions by fusing textures of the clothing and the human body with the guidance of explicit limb-aware features. Extensive experiments demonstrate that our proposed method outperforms the state-of-the-art virtual try-on methods both qualitatively and quantitatively. The code is available at https://github.com/xyhanHIT/PL-VTON.
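To make "pixel-level clothing displacements" and parsing-guided fusion concrete, the following is a minimal NumPy sketch, not the authors' MCW, HPE, or LTF modules. It assumes the displacement field and the clothing mask are already given, whereas in PL-VTON these would be predicted by learned networks. The function names `warp_by_displacement` and `fuse_by_parsing` are hypothetical, and nearest-neighbor sampling is used only to keep the example short.

```python
import numpy as np


def warp_by_displacement(image, flow):
    """Backward-warp an (H, W, C) image with a per-pixel displacement field.

    flow has shape (H, W, 2); flow[y, x] = (dy, dx) means that output pixel
    (y, x) is sampled from input location (y + dy, x + dx).
    Assumption: nearest-neighbor sampling with border clamping.
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(ys + flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + flow[..., 1]).astype(int), 0, w - 1)
    return image[src_y, src_x]


def fuse_by_parsing(person, warped_cloth, cloth_mask):
    """Composite warped clothing onto the person image.

    cloth_mask is an (H, W) array with 1 where the (hypothetical) parsing map
    labels a pixel as clothing and 0 elsewhere (e.g. limbs, background).
    """
    m = cloth_mask[..., None].astype(person.dtype)
    return m * warped_cloth + (1 - m) * person
```

A parsing mask of this kind is what keeps clothing texture from bleeding into limb regions: pixels labeled as exposed skin keep the person's own texture regardless of where the warped garment lands.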
