Deep learning for 3D human pose estimation and mesh recovery: A survey

Yang Liu; Changzhen Qiu; Zhiyong Zhang

TL;DR

The paper surveys deep learning methods for reconstructing 3D human pose and full-body meshes from visual data. It covers single-person and multi-person 3D human pose estimation (HPE) as well as human mesh recovery (HMR) based on explicit (parametric) models and implicit representations, and organizes them into a taxonomy with benchmark comparisons. Key contributions include a unified survey of over 200 references; structured coverage of sensors, body representations, datasets, metrics, and applications; future research directions; and a regularly updated project page. The survey aims to guide researchers and practitioners by clarifying method trade-offs, data requirements, and practical pathways toward real-time, detailed human models in real-world applications.

Abstract

3D human pose estimation and mesh recovery have attracted widespread research interest in many areas, such as computer vision, autonomous driving, and robotics. Deep learning for 3D human pose estimation and mesh recovery has recently thrived, with numerous methods proposed to address different problems in this area. In this paper, to stimulate future research, we present a comprehensive review of progress over the past five years in deep learning methods for this area, drawing on over 200 references. To the best of our knowledge, this survey is the first to comprehensively cover deep learning methods for 3D human pose estimation, including both single-person and multi-person approaches, as well as human mesh recovery, encompassing methods based on explicit models and implicit representations. We also present comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions. A regularly updated project page can be found at https://github.com/liuyangme/SOTA-3DHPE-HMR.

Paper Structure (28 sections, 3 equations, 8 figures, 6 tables)

This paper contains 28 sections, 3 equations, 8 figures, 6 tables.

Introduction
Motivation
Scope of this survey
Background
Sensors used for 3D HPE and HMR
Active sensors
Passive sensors
Representation for human body
Overview of Deep Learning for 3D HPE and HMR
3D Human Pose Estimation
Single person 3D pose estimation
Single person 3D pose estimation in images
Single person 3D pose estimation in videos
Multi-person 3D pose estimation
Top-down methods
...and 13 more sections

Figures (8)

Figure 1: Recent research of deep learning for 3D HPE and HMR.
Figure 2: A taxonomy of deep learning methods for 3D HPE and HMR.
Figure 3: A basic framework of deep learning for 3D HPE and HMR.
Figure 4: Typical single person 3D human pose estimation. (a) The direct estimation method; (b) The 2D to 3D lifting method.
Figure 5: (a) Depth ambiguity; (b) Graph-based representation for human body; (c) Transfer learning.
...and 3 more figures
