Beyond Imperfections: A Conditional Inpainting Approach for End-to-End Artifact Removal in VTON and Pose Transfer
Aref Tabatabaei, Zahra Dehghanian, Maryam Amirmazlaghani
TL;DR
The paper addresses artifacts that degrade the realism of virtual try-on (VTON) and pose transfer outputs. It proposes a conditional inpainting framework built on Stable Diffusion, guided by ControlNet and IP-Adapter, and augmented with automatic artifact detection and multi-modal conditioning. It introduces two task-specific datasets, DDI and VDI, with artifact masks and references for robust evaluation. Experiments show gains on standard image-quality metrics (e.g., SSIM, LPIPS, FID) and favorable ratings from human evaluators, indicating cleaner, more realistic renderings. The method provides an end-to-end solution and public resources to advance artifact removal in VTON and pose transfer.
Abstract
Artifacts often degrade the visual quality of virtual try-on (VTON) and pose transfer outputs, harming the user experience. This study introduces a novel conditional inpainting technique designed to detect and remove such distortions, improving image aesthetics. Our work is the first to present an end-to-end framework addressing this specific issue. We also introduce a specialized dataset of artifacts in VTON and pose transfer tasks, complete with masks highlighting the affected areas. Experimental results show that our method not only removes artifacts effectively but also significantly enhances the visual quality of the final images, setting a new benchmark for this task.
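To illustrate how a mask highlighting affected areas can confine a repair to those areas, the following is a minimal sketch of mask-guided compositing, a common convention in inpainting. It is not the paper's implementation: the function name `composite`, the binary mask encoding (1 = artifact, 0 = keep), and the toy grayscale values are all assumptions made for illustration.

```python
def composite(original, inpainted, mask):
    """Blend two images per pixel.

    Hypothetical helper: keeps the original pixel where mask is 0
    and takes the inpainted pixel where mask is 1.
    Images are nested lists of grayscale values with equal shapes.
    """
    return [
        [m * p_new + (1 - m) * p_old
         for p_old, p_new, m in zip(row_o, row_i, row_m)]
        for row_o, row_i, row_m in zip(original, inpainted, mask)
    ]


# Toy 2x2 example: only the top-right pixel is flagged as an artifact.
original = [[10, 20], [30, 40]]
inpainted = [[99, 98], [97, 96]]
mask = [[0, 1], [0, 0]]

result = composite(original, inpainted, mask)
print(result)
```

Only the masked pixel changes, so regions the detector did not flag stay identical to the input. A diffusion-based pipeline would generate `inpainted` from the model rather than supply it by hand.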
