Multiple-Human Parsing in the Wild
Jianshu Li, Jian Zhao, Yunchao Wei, Congyan Lang, Yidong Li, Terence Sim, Shuicheng Yan, Jiashi Feng
TL;DR
The paper introduces the problem of multi-human parsing (MHP) in the wild and provides a new large-scale dataset with pixel-level, instance-aware annotations for multiple people in realistic scenes. It proposes MH-Parser, a bottom-up model that jointly predicts a global, instance-agnostic parsing map and a learned pairwise affinity graph; the affinities are refined by a Graph-GAN, and CRF-based post-processing then produces the final multi-person parsing. The key innovations are the Graph-GAN, which learns high-order affinities on graphs of superpixels, and a CRF refinement that integrates the learned affinities with appearance and spatial cues. The method achieves results competitive with state-of-the-art baselines and handles closely entangled humans well, establishing a solid baseline for future research on multi-human parsing in the wild.
Abstract
Human parsing is attracting increasing research attention. In this work, we aim to push the frontier of human parsing by introducing the problem of multi-human parsing in the wild. Existing works on human parsing mainly tackle single-person scenarios, which deviates from real-world applications where multiple persons are present simultaneously with interaction and occlusion. To address the multi-human parsing problem, we introduce a new multi-human parsing (MHP) dataset and a novel multi-human parsing model named MH-Parser. The MHP dataset contains multiple persons captured in real-world scenes with pixel-level fine-grained semantic annotations in an instance-aware setting. The MH-Parser generates global parsing maps and person instance masks simultaneously in a bottom-up fashion with the help of a new Graph-GAN model. We envision that the MHP dataset will serve as a valuable data resource to develop new multi-human parsing models, and the MH-Parser offers a strong baseline to drive future research for multi-human parsing in the wild.
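To make the bottom-up idea concrete, the following is a minimal sketch of how a global parsing map and a pairwise affinity graph over superpixels could be combined into person instance masks. It is not the authors' implementation: the Graph-GAN and CRF stages are not modeled, and the grouping step here (thresholding affinities and taking connected components with union-find) is a deliberately simple stand-in. The function name `group_instances`, its parameters, and the toy inputs are all hypothetical.

```python
import numpy as np

def group_instances(parsing, superpixels, affinity, threshold=0.5):
    """Group foreground superpixels into person instances (illustrative only).

    parsing:     (H, W) int array, 0 = background, >0 = body-part category
    superpixels: (H, W) int array of superpixel ids in [0, N)
    affinity:    (N, N) symmetric array, higher = more likely the same person
    Returns an (H, W) int array: 0 = background, 1..K = person instance ids.
    """
    n = affinity.shape[0]
    flat_sp = superpixels.ravel()
    is_part = (parsing.ravel() > 0).astype(float)

    # A superpixel counts as foreground if most of its pixels are body parts.
    fg_count = np.bincount(flat_sp, weights=is_part, minlength=n)
    total = np.bincount(flat_sp, minlength=n)
    is_fg = fg_count > 0.5 * np.maximum(total, 1)

    # Union-find over superpixels; an edge exists where affinity > threshold.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if is_fg[i] and is_fg[j] and affinity[i, j] > threshold:
                parent[find(i)] = find(j)

    # Give each connected component of foreground superpixels an instance id.
    instance_of = np.zeros(n, dtype=np.int64)
    ids = {}
    for i in range(n):
        if is_fg[i]:
            root = find(i)
            instance_of[i] = ids.setdefault(root, len(ids) + 1)

    instances = instance_of[superpixels]
    instances[parsing == 0] = 0  # keep background pixels unassigned
    return instances

# Toy example: two people separated by a background column.
parsing = np.array([[1, 1, 0, 2],
                    [1, 1, 0, 2]])
superpixels = np.array([[0, 0, 1, 2],
                        [0, 3, 1, 2]])
affinity = np.eye(4)
affinity[0, 3] = affinity[3, 0] = 0.9  # same person
affinity[0, 2] = affinity[2, 0] = 0.1  # different people
affinity[2, 3] = affinity[3, 2] = 0.1  # different people
instances = group_instances(parsing, superpixels, affinity)
```

In this toy input, superpixels 0 and 3 are linked by a high affinity and end up in one instance, while superpixel 2 forms a second instance. A hard threshold is brittle when people are closely entangled, which is the case the learned, refined affinities in the paper are meant to address.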
