PoseBench: Benchmarking the Robustness of Pose Estimation Models under Corruptions

Sihan Ma, Jing Zhang, Qiong Cao, Dacheng Tao

TL;DR

PoseBench is a comprehensive benchmark for evaluating the robustness of pose estimation models against real-world corruptions. The paper also examines design considerations that can improve robustness, including input resolution, pre-training datasets, backbone capacity, post-processing, and data augmentations.

Abstract

Pose estimation aims to accurately identify anatomical keypoints in humans and animals from monocular images, which is crucial for applications such as human-machine interaction, embodied AI, and autonomous driving. While current models show promising results, they are typically trained and tested on clean data, overlooking the corruptions encountered during real-world deployment and thus posing safety risks in practical scenarios. To address this issue, we introduce PoseBench, a comprehensive benchmark designed to evaluate the robustness of pose estimation models against real-world corruptions. We evaluate 60 representative models, including top-down, bottom-up, heatmap-based, regression-based, and classification-based methods, across three datasets for human and animal pose estimation. Our evaluation involves 10 types of corruption in four categories: 1) blur and noise, 2) compression and color loss, 3) severe lighting, and 4) masks. Our findings reveal that state-of-the-art models are vulnerable to common real-world corruptions and behave differently on human and animal pose estimation tasks. To improve model robustness, we examine various design considerations, including input resolution, pre-training datasets, backbone capacity, post-processing, and data augmentations. We hope that our benchmark will serve as a foundation for advancing research in robust pose estimation. The benchmark and source code will be released at https://xymsh.github.io/PoseBench

Paper Structure (37 sections, 1 equation, 7 figures, 18 tables)

Figures (7)

  • Figure 1: Visualization of the four corruption categories examined in this paper.
  • Figure 2: Corruption severity level. Each corruption consists of five severity levels.
  • Figure 3: Pose estimation results in terms of mRR (%) for 10 representative models on the COCO-C, OCHuman-C, and AP10K-C datasets. Corruptions: #1 Motion Blur, #2 Gaussian Noise, #3 Impulse Noise, #4 Pixelate, #5 JPEG Compression, #6 Color Quant, #7 Brightness, #8 Darkness, #9 Contrast, #10 Mask.
  • Figure 4: The strong correlation between clean mAP and corruption robustness mRR (%).
  • Figure 5: Per-Severity Error Analysis: The mRR (%) results for 10 representative methods on the COCO-C dataset, with higher severity levels indicating greater degrees of corruption.
  • ...and 2 more figures
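
To make the idea of "five severity levels" per corruption concrete, here is a minimal sketch of one corruption from the list above (Gaussian noise) applied at severities 1 to 5. It is not the PoseBench implementation: the function names, the `GAUSSIAN_SIGMA` table, and its values are hypothetical and chosen only for illustration, and the image is a plain grayscale grid so that the example needs only the standard library.

```python
import random

# Hypothetical noise standard deviations (0-255 pixel scale) for severity 1..5.
# These values are illustrative assumptions, not the settings used by PoseBench.
GAUSSIAN_SIGMA = {1: 4.0, 2: 8.0, 3: 16.0, 4: 32.0, 5: 64.0}


def gaussian_noise(image, severity, seed=0):
    """Return a noisy copy of a grayscale image.

    `image` is a list of rows, each a list of ints in 0..255. Zero-mean
    Gaussian noise is added per pixel and the result is clipped to 0..255.
    """
    if severity not in GAUSSIAN_SIGMA:
        raise ValueError("severity must be an integer from 1 to 5")
    rng = random.Random(seed)  # fixed seed keeps the corruption reproducible
    sigma = GAUSSIAN_SIGMA[severity]
    return [
        [min(255, max(0, round(pixel + rng.gauss(0.0, sigma)))) for pixel in row]
        for row in image
    ]


def mean_abs_change(original, corrupted):
    """Average absolute per-pixel difference between two same-sized images."""
    total = sum(
        abs(a - b)
        for row_a, row_b in zip(original, corrupted)
        for a, b in zip(row_a, row_b)
    )
    return total / (len(original) * len(original[0]))


# Flat mid-gray test image; measure how far each severity moves the pixels.
clean = [[128] * 32 for _ in range(32)]
changes = [mean_abs_change(clean, gaussian_noise(clean, s)) for s in range(1, 6)]
```

In this sketch `changes` grows with the severity index, which mirrors the per-severity analysis described in the Figure 5 caption: a robustness benchmark would run a model on each corrupted copy and compare its accuracy with the clean result.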