DHRNet: A Dual-Path Hierarchical Relation Network for Multi-Person Pose Estimation
Yonghao Dang, Jianqin Yin, Liyuan Liu, Pengxiang Ding, Yuan Sun, Yanzhu Hu
TL;DR
DHRNet (Dual-path Hierarchical Relation Network) tackles multi-person pose estimation (MPPE) by jointly modeling cross-instance and cross-joint interactions. Its core is the Dual-path Interaction Modeling Module (DIM), which combines cross-instance (CIM) and cross-joint (CJM) blocks with adaptive feature fusion modules (ADFMs) to produce instance- and joint-aware representations that feed a pose decoder. Results on COCO, CrowdPose, and OCHuman show state-of-the-art performance among single-stage methods, with notable gains in occlusion-heavy and crowded scenes, and ablations validate the contribution of each DIM component. The approach is an end-to-end framework that uses complementary relational information to improve joint localization in challenging MPPE scenarios.
Abstract
Multi-person pose estimation (MPPE) is a formidable yet crucial challenge in computer vision. Most existing methods model interactions in isolation, either between instances or between joints, which is inadequate for scenarios that require localizing both instances and joints concurrently. This paper introduces a novel CNN-based single-stage method, named Dual-path Hierarchical Relation Network (DHRNet), to extract instance-to-joint and joint-to-instance interactions concurrently. Specifically, we design a dual-path interaction modeling module (DIM) that arranges cross-instance and cross-joint interaction modeling modules in two complementary orders, enriching interaction information by integrating the merits of the different correlation modeling branches. Notably, DHRNet excels in joint localization by leveraging information from other instances and joints. Extensive evaluations on challenging datasets, including COCO, CrowdPose, and OCHuman, showcase DHRNet's state-of-the-art performance. The code will be released at https://github.com/YHDang/dhrnet-multi-pose-estimation.
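The "two complementary orders" idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameter-free dot-product attention used as a stand-in for the learned CNN-based interaction blocks, and the fixed scalar `alpha` used in place of adaptive fusion are all assumptions made for illustration. Features are assumed to be indexed as `feats[instance][joint][channel]`.

```python
import math

def self_attend(vectors):
    """Parameter-free scaled dot-product self-attention over a list of vectors.
    Assumption: a stand-in for a learned interaction block."""
    dim = len(vectors[0])
    out = []
    for query in vectors:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(w * v[c] for w, v in zip(weights, vectors))
                    for c in range(dim)])
    return out

def cross_instance(feats):
    """For each joint index, mix that joint's features across all instances."""
    n_inst, n_joints = len(feats), len(feats[0])
    out = [[None] * n_joints for _ in range(n_inst)]
    for k in range(n_joints):
        mixed = self_attend([feats[n][k] for n in range(n_inst)])
        for n in range(n_inst):
            out[n][k] = mixed[n]
    return out

def cross_joint(feats):
    """For each instance, mix features across that instance's joints."""
    return [self_attend(instance) for instance in feats]

def dual_path(feats, alpha=0.5):
    """Run both orderings and blend them.
    Assumption: a fixed scalar alpha replaces learned adaptive fusion."""
    path_a = cross_joint(cross_instance(feats))   # instances first, then joints
    path_b = cross_instance(cross_joint(feats))   # joints first, then instances
    return [[[alpha * a + (1.0 - alpha) * b for a, b in zip(va, vb)]
             for va, vb in zip(inst_a, inst_b)]
            for inst_a, inst_b in zip(path_a, path_b)]

# Toy input: 2 instances, 3 joints, 2 channels per joint.
example = [
    [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]],
    [[0.2, 0.8], [0.9, 0.1], [0.4, 0.6]],
]
fused = dual_path(example)
```

The two paths generally give different results because the mixing steps do not commute, which is why blending them can carry information that either ordering alone would miss.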
