Humanoid Locomotion and Manipulation: Current Progress and Challenges in Control, Planning, and Learning

Zhaoyuan Gu, Junheng Li, Wenlan Shen, Wenhao Yu, Zhaoming Xie, Stephen McCrory, Xianyi Cheng, Abdulaziz Shamsah, Robert Griffin, C. Karen Liu, Abderrahmane Kheddar, Xue Bin Peng, Yuke Zhu, Guanya Shi, Quan Nguyen, Gordon Cheng, Huijun Gao, Ye Zhao

TL;DR

This survey addresses the challenge of unifying locomotion and manipulation in humanoid robots by reviewing both model-based planning/control and learning-based approaches, with additional attention to tactile sensing and foundation models. It highlights the evolution from traditional model-based MPC/WBC frameworks to learning-enabled methods such as RL, IL, and diffusion-based policies, and discusses how foundation models may enable open-world reasoning and generalist humanoid agents. The paper examines multi-contact planning, whole-body control, and MPC speed-ups, and frames the transfer from simulation to real robots as a central bottleneck, proposing hybrid and data-efficient strategies. By contrasting strengths and limitations across paradigms and outlining practical benchmarks, the survey guides researchers toward integrated, robust loco-manipulation systems and points to future directions in hardware, sensing, and foundation-model integration.

Abstract

Humanoid robots hold great potential to perform various human-level skills, involving unified locomotion and manipulation in real-world settings. Driven by advances in machine learning and the strength of existing model-based approaches, these capabilities have progressed rapidly, but often separately. This survey offers a comprehensive overview of the state-of-the-art in humanoid locomotion and manipulation (HLM), with a focus on control, planning, and learning methods. We first review the model-based methods that have been the backbone of humanoid robotics for the past three decades. We discuss contact planning, motion planning, and whole-body control, highlighting the trade-offs between model fidelity and computational efficiency. Then the focus is shifted to examine emerging learning-based methods, with an emphasis on reinforcement and imitation learning that enhance the robustness and versatility of loco-manipulation skills. Furthermore, we assess the potential of integrating foundation models with humanoid embodiments to enable the development of generalist humanoid agents. This survey also highlights the emerging role of tactile sensing, particularly whole-body tactile feedback, as a crucial modality for handling contact-rich interactions. Finally, we compare the strengths and limitations of model-based and learning-based paradigms from multiple perspectives, such as robustness, computational efficiency, versatility, and generalizability, and suggest potential solutions to existing challenges.

Paper Structure (71 sections, 4 equations, 17 figures, 6 tables)

Figures (17)

  • Figure 1: Humanoids executing locomotion and manipulation tasks: (a) HRP-4 wipes a wood board while adapting to terrain [humanoid_shuffle]; (b-g) object pick-and-place by Digit, Hector [li2023dynamic], Atlas, H1, TORO [henze2016passivity], and Apollo; (h) iCub pushes a cart [DS_iCub]; (i) Nadia opens a door [Nadia_door_auto]; (j-k) object manipulation by Figure 02 and Optimus; (l) MIT humanoid whole-body push recovery [khazoom2024tailoring].
  • Figure 2: This survey begins by defining relevant concepts of humanoid robots and their locomotion and manipulation capabilities. Centered around achieving humanoid loco-manipulation tasks, the core of this survey delves into two main categories of methods: the traditional planning and control approaches, such as contact planning, motion planning, and control, as well as the emerging learning-based approaches, including skill learning and foundation models. In addition, this survey highlights whole-body tactile sensing as a crucial modality to achieve contact-rich loco-manipulation.
  • Figure 3: (a) Whole-body manipulation, exemplified by a human and the humanoid Justin [WAM_Justin] interacting with objects using all body surfaces. (c) Loco-manipulation involves simultaneous locomotion and manipulation, as shown in collaborative tasks performed by humans and a humanoid [agravante2019human]. (b) Whole-body loco-manipulation is the intersection of (a) and (c), as exemplified by a human and the humanoid HRP-4 [Murooka_heavy_push] pushing heavy objects using their legs and arms.
  • Figure 4: Tactile sensing on humanoid robots, exemplified by (a) REEM-C fully covered with artificial skin [cheng2019robotskin] (image copyright: A. Eckert). The examples cover three body regions: hands, feet, and the whole body. (i) Hand tactile sensors, demonstrated by (b) the Shadow Dexterous Hand equipped with tactile sensors on the palm and fingertips [melnik2021using], (c) the Allegro Hand equipped with DIGIT sensors [lambeta2020digit], and (d) BioTac tactile sensors for dexterous manipulation [veiga2020hierarchical]; (ii) tactile sensors on foot soles for (e) obstacle recognition [guadarrama2018enhancing] and (f) terrain classification [xiaofeng_foot]; (iii) whole-body tactile sensors for (g) whole-body manipulation by Punyo-1 [Punyo] and whole-body human-robot interaction by (h) iCub [maiolino2013flexible] and (i) REEM-C [dean2019wholetactile_dance].
  • Figure 5: An illustration of a task sequence for loco-manipulation planning in humanoid robots, involving carrying and placing a box and pushing a cart. The planning techniques explored include (a) multi-contact trajectory planning and (b) whole-body pose planning, highlighting their contact and state planning strategies. Additionally, the pros and cons of categorized approaches in (i) sampling-based, (ii) optimization-based, and (iii) learning-based methods are summarized.
  • ...and 12 more figures