Towards Label-Efficient Human Matting: A Simple Baseline for Weakly Semi-Supervised Trimap-Free Human Matting
Beomyoung Kim, Myeong Yeon Yi, Joonsang Yu, Young Joon Yoo, Sung Ju Hwang
TL;DR
This work tackles the high annotation cost of matte labels in human matting and the domain generalization gap of trimap-free models trained on synthetic data. It introduces Weakly Semi-Supervised Human Matting (WSSHM) and a simple yet effective Matte Label Blending (MLB) strategy within a two-stage teacher-student framework: a teacher learns boundary fidelity from synthetic matte data, and a student learns robust matte prediction from natural segmentation data guided by MLB, which blends teacher boundaries with coarse segmentation. The approach yields strong improvements in real-world robustness and boundary detail with modest matte data, while enabling real-time performance on lightweight backbones and transferability across multiple matting architectures. Overall, MLB demonstrates a practical path to label-efficient matting with strong domain generalization and boundary quality, pointing to future work leveraging unlabeled data through pseudo-labeling.
Abstract
This paper presents a new practical training method for human matting, a task that demands delicate pixel-level identification of human regions and highly laborious annotation. To reduce the annotation cost, most existing matting approaches rely on image synthesis to augment the dataset. However, the unnaturalness of synthesized training images introduces a domain generalization challenge on natural images. To address this challenge, we introduce a new learning paradigm, weakly semi-supervised human matting (WSSHM), which leverages a small amount of expensive matte labels and a large amount of budget-friendly segmentation labels to save annotation cost and resolve the domain generalization problem. To achieve the goal of WSSHM, we propose a simple and effective training method, named Matte Label Blending (MLB), that selectively transfers only the beneficial knowledge of the segmentation and matte data to the matting model. Extensive experiments and detailed analysis demonstrate that our method substantially improves the robustness of the matting model using a small amount of matte data and a large amount of segmentation data. Our training method is also easily applicable to real-time models, achieving competitive accuracy at very high inference speed (328 FPS on an NVIDIA V100 GPU). The implementation code is available at https://github.com/clovaai/WSSHM.
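To make the blending idea concrete, the following is a minimal sketch of one way a teacher's boundary prediction could be combined with a coarse segmentation label. The exact MLB rule is not given in the text above, so everything here is an assumption for illustration: the band-based rule, the square dilation window, and the names `boundary_band`, `blend_matte_label`, and `radius` are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np


def _dilate(mask: np.ndarray, radius: int) -> np.ndarray:
    """Binary dilation with a (2r+1)x(2r+1) square window, NumPy only."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros((h, w), dtype=bool)
    # OR together every shifted copy of the mask inside the window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def boundary_band(seg_mask: np.ndarray, radius: int = 1) -> np.ndarray:
    """Pixels within `radius` of the segmentation edge (assumed definition)."""
    seg = seg_mask.astype(bool)
    dilated = _dilate(seg, radius)
    eroded = ~_dilate(~seg, radius)  # erosion expressed via dilation of the complement
    return dilated & ~eroded


def blend_matte_label(teacher_alpha: np.ndarray,
                      seg_mask: np.ndarray,
                      radius: int = 1) -> np.ndarray:
    """Hypothetical blend: teacher alpha near the edge, segmentation elsewhere."""
    band = boundary_band(seg_mask, radius)
    coarse = seg_mask.astype(teacher_alpha.dtype)
    return np.where(band, teacher_alpha, coarse)


if __name__ == "__main__":
    seg = np.zeros((9, 9), dtype=np.uint8)
    seg[3:6, 3:6] = 1                                  # coarse human mask
    teacher = np.full((9, 9), 0.5, dtype=np.float32)   # stand-in teacher output
    print(blend_matte_label(teacher, seg, radius=1))
```

Under this assumed rule, the student's training target keeps the hard 0/1 segmentation label in confident interior and background regions and takes soft alpha values from the teacher only in a narrow band around the mask edge, which is where a coarse segmentation label carries the least boundary detail.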
