Deep learning for 3D human pose estimation and mesh recovery: A survey

Yang Liu, Changzhen Qiu, Zhiyong Zhang

TL;DR

The paper addresses the problem of reconstructing accurate 3D human poses and full-body meshes from visual data using deep learning. It synthesizes advances across single- and multi-person human pose estimation (HPE) and across explicit (parametric) and implicit mesh-recovery methods, outlining a taxonomy and benchmarking landscape. Key contributions include a unified survey of over 200 references; structured coverage of sensors, representations, datasets, metrics, and applications; forward-looking research directions; and a regularly updated project page. The work guides researchers and practitioners by clarifying method trade-offs, data requirements, and practical pathways toward real-time, detailed human models in real-world applications.

Abstract

3D human pose estimation and mesh recovery have attracted widespread research interest in many areas, such as computer vision, autonomous driving, and robotics. Deep learning for 3D human pose estimation and mesh recovery has recently thrived, with numerous methods proposed to address different problems in this area. To stimulate future research, we present a comprehensive review of the past five years of progress in deep learning methods for this area, drawing on over 200 references. To the best of our knowledge, this survey is the first to comprehensively cover deep learning methods for 3D human pose estimation, including both single-person and multi-person approaches, as well as human mesh recovery, encompassing methods based on explicit models and implicit representations. We also present comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions. A regularly updated project page can be found at https://github.com/liuyangme/SOTA-3DHPE-HMR.

Paper Structure (28 sections, 3 equations, 8 figures, 6 tables)

This paper contains 28 sections, 3 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Recent research of deep learning for 3D HPE and HMR.
  • Figure 2: A taxonomy of deep learning methods for 3D HPE and HMR.
  • Figure 3: A basic framework of deep learning for 3D HPE and HMR.
  • Figure 4: Typical single-person 3D human pose estimation. (a) The direct estimation method; (b) The 2D-to-3D lifting method.
  • Figure 5: (a) Depth ambiguity; (b) Graph-based representation for human body; (c) Transfer learning.
  • ...and 3 more figures