PBADet: A One-Stage Anchor-Free Approach for Part-Body Association

Zhongpai Gao, Huayi Zhou, Abhishek Sharma, Meng Zheng, Benjamin Planche, Terrence Chen, Ziyan Wu

TL;DR

PBADet addresses part–body association in crowded scenes with a one-stage, anchor-free detector that uses a single 2D part-to-body center offset. The method extends the dense per-pixel predictions with a dedicated offset $(m_i,n_i)$ that points from a part to the center of its parent body, so the representation scales to any number of parts without sacrificing accuracy. Training uses TOOD-inspired losses plus a dedicated association loss $\mathcal{L}_{assoc}$ with an anchor-alignment metric; decoding computes the body center implied by each part's offset and assigns the part via nearest-body matching. Empirically, PBADet achieves state-of-the-art or competitive results on BodyHands and COCOHumanParts, remains robust in crowds (CrowdHuman), and generalizes well to animal datasets.

Abstract

The detection of human parts (e.g., hands, face) and their correct association with individuals is an essential task, e.g., for ubiquitous human-machine interfaces and action recognition. Traditional methods often employ multi-stage processes, rely on cumbersome anchor-based systems, or do not scale well to larger part sets. This paper presents PBADet, a novel one-stage, anchor-free approach for part-body association detection. Building upon the anchor-free object representation across multi-scale feature maps, we introduce a singular part-to-body center offset that effectively encapsulates the relationship between parts and their parent bodies. Our design is inherently versatile and capable of managing multiple parts-to-body associations without compromising on detection accuracy or robustness. Comprehensive experiments on various datasets underscore the efficacy of our approach, which not only outperforms existing state-of-the-art techniques but also offers a more streamlined and efficient solution to the part-body association challenge.
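
The decoding step summarized in the TL;DR (compute the body center implied by each part's offset, then pick the nearest body) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of the part box center as the reference point, offsets expressed directly in pixels, and the absence of any stride scaling or containment filtering are all assumptions made for the example.

```python
import math


def box_center(box):
    """Center (x, y) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def assign_parts(part_boxes, part_offsets, body_boxes):
    """Return, for each part, the index of the nearest body (or None).

    Hypothetical helper: each part carries one 2D offset (m, n) that is
    assumed to point from the part's box center to its body's center.
    """
    body_centers = [box_center(b) for b in body_boxes]
    assignments = []
    for box, (m, n) in zip(part_boxes, part_offsets):
        if not body_centers:
            # No detected bodies, so the part stays unassigned.
            assignments.append(None)
            continue
        px, py = box_center(box)
        cx, cy = px + m, py + n  # predicted body center for this part
        nearest = min(
            range(len(body_centers)),
            key=lambda j: math.hypot(body_centers[j][0] - cx,
                                     body_centers[j][1] - cy),
        )
        assignments.append(nearest)
    return assignments


# Two bodies side by side, and two hands that sit close to each other.
bodies = [(0, 0, 100, 200), (120, 0, 220, 200)]
hands = [(90, 100, 110, 120), (115, 100, 135, 120)]
offsets = [(-48, -12), (47, -8)]  # made-up offsets for illustration
print(assign_parts(hands, offsets, bodies))  # [0, 1]
```

The two hands are only 25 pixels apart, yet each is assigned to a different body because the offset, not the hand's own position, determines the match. Since every part carries exactly one offset, adding more part categories does not change the size of the association output.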

Paper Structure (28 sections, 6 equations, 6 figures, 9 tables)

Figures (6)

  • Figure 1: Comparison of the extended representations used by BPJDet [zhou2023body] and our PBADet. Taking the joint detection of three parts (head, left hand, and right hand) as an example, BPJDet's body-to-part configuration requires an extended representation of length 15 (6+K+2K, K=3). Positions left unused in the part object representation make this approach inefficient and redundant. In contrast, PBADet, which works on a part-to-body principle, adopts a more concise representation of length 10 (5+K+2, K=3), improving utilization and efficiency.
  • Figure 2: Illustration of the proposed pipeline.
  • Figure 3: Qualitative results on BodyHands. Red arrows highlight our correct predictions.
  • Figure 4: Qualitative results on COCOHumanParts. Yellow and red arrows highlight other methods' failure cases and our correct predictions, respectively.
  • Figure 5: Qualitative comparison with ED-Pose [yang2023explicit] on images from COCO. Yellow circles highlight erroneous predictions.
  • ...and 1 more figure
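
The representation lengths quoted in the Figure 1 caption can be checked, and extrapolated to other part counts, with a short calculation. The two formulas (6+K+2K and 5+K+2) come from the caption; the function names are hypothetical and exist only for this example.

```python
def bpjdet_length(k):
    """Length of the body-to-part representation: 6 + K + 2K."""
    return 6 + k + 2 * k


def pbadet_length(k):
    """Length of the part-to-body representation: 5 + K + 2."""
    return 5 + k + 2


# K = 3 reproduces the caption's example (15 versus 10).
for k in (1, 3, 6):
    print(k, bpjdet_length(k), pbadet_length(k))
```

The gap between the two is 2K - 1, so it widens with every added part: the body-to-part form needs one 2D offset per part, while the part-to-body form always needs a single 2D offset.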