Image-Based Virtual Try-On: A Survey
Dan Song, Xuanpu Zhang, Juan Zhou, Weizhi Nie, Ruofeng Tong, Mohan Kankanhalli, An-An Liu
TL;DR
This survey frames image-based virtual try-on as person image generation conditioned on a target clothing image, and provides a taxonomy spanning pipeline types, cloth-agnostic person representations, and three core modules: try-on indication, cloth warping, and try-on synthesis. It introduces a unified evaluation framework that includes CLIP-based semantic scoring, standard metrics (SSIM, FID, LPIPS), and a cross-dataset protocol, and it benchmarks representative methods on VITON-HD, complemented by a user study with 139 participants. The authors analyze trends across TPS, STN, flow-based, and implicit-transformation warping, highlighting that diffusion-based generation yields state-of-the-art results in many settings while noting persistent challenges in parsing dependency, pose handling, and controllability. They also outline unresolved issues and future directions, such as parser-free representations, diffusion-based controllable generation, multi-modal data integration, and the development of specialized datasets and metrics to drive industry-ready virtual try-on solutions.
Abstract
Image-based virtual try-on aims to synthesize a naturally dressed image of a person from a given clothing image. The task revolutionizes online shopping and inspires related topics within image generation, showing both research significance and commercial potential. However, a gap remains between current research progress and commercial applications, and the field lacks a comprehensive overview that could accelerate its development. In this survey, we provide a comprehensive analysis of state-of-the-art techniques and methodologies in terms of pipeline architecture, person representation, and key modules such as try-on indication, clothing warping, and the try-on stage. We additionally apply CLIP to assess the semantic alignment of try-on results, and evaluate representative methods with uniformly implemented evaluation metrics on the same dataset. In addition to a quantitative and qualitative evaluation of current open-source methods, we highlight unresolved issues and outline future research directions to identify key trends and inspire further exploration. The uniformly implemented evaluation metrics, dataset, and collected methods will be made publicly available at https://github.com/little-misfit/Survey-Of-Virtual-Try-On.
