High-Fidelity Virtual Try-on with Large-Scale Unpaired Learning

Han Yang; Yanlong Zang; Ziwei Liu

High-Fidelity Virtual Try-on with Large-Scale Unpaired Learning

Han Yang, Yanlong Zang, Ziwei Liu

TL;DR

Boosted Virtual Try-on shows great generalizability and scalability to various dressing styles and data sources, and an unpaired try-on synthesizer by constructing pseudo training pairs with randomly misaligned on-model clothes, where intricate skin texture and clothes boundaries can be generated.

Abstract

Virtual try-on (VTON) transfers a target clothing image to a reference person, where clothing fidelity is a key requirement for downstream e-commerce applications. However, existing VTON methods still fall short in high-fidelity try-on due to the conflict between the high diversity of dressing styles (\eg clothes occluded by pants or distorted by posture) and the limited paired data for training. In this work, we propose a novel framework \textbf{Boosted Virtual Try-on (BVTON)} to leverage the large-scale unpaired learning for high-fidelity try-on. Our key insight is that pseudo try-on pairs can be reliably constructed from vastly available fashion images. Specifically, \textbf{1)} we first propose a compositional canonicalizing flow that maps on-model clothes into pseudo in-shop clothes, dubbed canonical proxy. Each clothing part (sleeves, torso) is reversely deformed into an in-shop-like shape to compositionally construct the canonical proxy. \textbf{2)} Next, we design a layered mask generation module that generates accurate semantic layout by training on canonical proxy. We replace the in-shop clothes used in conventional pipelines with the derived canonical proxy to boost the training process. \textbf{3)} Finally, we propose an unpaired try-on synthesizer by constructing pseudo training pairs with randomly misaligned on-model clothes, where intricate skin texture and clothes boundaries can be generated. Extensive experiments on high-resolution ($1024\times768$) datasets demonstrate the superiority of our approach over state-of-the-art methods both qualitatively and quantitatively. Notably, BVTON shows great generalizability and scalability to various dressing styles and data sources.

High-Fidelity Virtual Try-on with Large-Scale Unpaired Learning

TL;DR

Abstract

High-Fidelity Virtual Try-on with Large-Scale Unpaired Learning

TL;DR

Abstract

Paper Structure

Table of Contents

Figures (19)