SkillBlender: Towards Versatile Humanoid Whole-Body Loco-Manipulation via Skill Blending

Yuxuan Kuang, Haoran Geng, Amine Elhafsi, Tan-Dzung Do, Pieter Abbeel, Jitendra Malik, Marco Pavone, Yue Wang

TL;DR

SkillBlender tackles versatile humanoid loco-manipulation by pretraining a library of goal-conditioned primitive skills and dynamically blending them with a high-level controller. The approach reduces task-specific reward engineering and is evaluated on SkillBench, a parallel cross-embodiment benchmark designed to measure both task accuracy and motion feasibility. Across extensive simulations, SkillBlender outperforms baselines in accuracy and naturalness of motion while mitigating reward hacking, and its components are validated through targeted ablations. The work provides open-source resources to accelerate research in humanoid learning and emphasizes cross-embodiment evaluation for scalable, real-world applicability.

Abstract

Humanoid robots hold significant potential in accomplishing daily tasks across diverse environments thanks to their flexibility and human-like morphology. Recent works have made significant progress in humanoid whole-body control and loco-manipulation leveraging optimal control or reinforcement learning. However, these methods require tedious task-specific tuning to achieve satisfactory behaviors, limiting their versatility and scalability to diverse tasks in daily scenarios. To that end, we introduce SkillBlender, a novel hierarchical reinforcement learning framework for versatile humanoid loco-manipulation. SkillBlender first pretrains goal-conditioned, task-agnostic primitive skills, and then dynamically blends these skills to accomplish complex loco-manipulation tasks with minimal task-specific reward engineering. We also introduce SkillBench, a parallel, cross-embodiment, and diverse simulated benchmark containing three embodiments, four primitive skills, and eight challenging loco-manipulation tasks, accompanied by a set of scientific evaluation metrics balancing accuracy and feasibility. Extensive simulated experiments show that our method significantly outperforms all baselines, while naturally regularizing behaviors to avoid reward hacking, resulting in more accurate and feasible movements for diverse loco-manipulation tasks in daily scenarios. Our code and benchmark will be open-sourced to the community to facilitate future research. Project page: https://usc-gvl.github.io/SkillBlender-web/.
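
The two-level idea in the abstract (frozen goal-conditioned skills below, a high-level controller that blends them above) can be illustrated with a minimal sketch. The abstract does not specify the blending mechanism, so everything here is an assumption made for illustration: the per-joint softmax weighting, the toy linear-tanh skill policies, and all names and dimensions (`blend_actions`, `skill_policy`, `num_joints`, and so on) are hypothetical and are not taken from the SkillBlender code.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def blend_actions(skill_actions, weight_logits):
    """Blend low-level skill actions into one action.

    skill_actions: (num_skills, num_joints) actions proposed by each skill.
    weight_logits: (num_skills, num_joints) scores from a high-level controller.
    Assumption: weights are normalized per joint across skills with a softmax,
    so each joint's command is a convex combination of the skills' commands.
    """
    weights = softmax(weight_logits, axis=0)
    return (weights * skill_actions).sum(axis=0)

def skill_policy(params, obs, goal):
    """Toy stand-in for a pretrained goal-conditioned skill (hypothetical)."""
    return np.tanh(params @ np.concatenate([obs, goal]))

# Hypothetical sizes; a real humanoid has far more joints and observations.
rng = np.random.default_rng(0)
num_skills, num_joints, obs_dim, goal_dim = 4, 6, 10, 3

# Frozen "pretrained" skills, represented here by random linear maps.
skill_params = [rng.normal(size=(num_joints, obs_dim + goal_dim))
                for _ in range(num_skills)]

obs = rng.normal(size=obs_dim)
# In this sketch the high-level controller would emit one subgoal per skill
# plus blending logits; both are random placeholders here.
subgoals = rng.normal(size=(num_skills, goal_dim))
logits = rng.normal(size=(num_skills, num_joints))

skill_actions = np.stack([skill_policy(p, obs, g)
                          for p, g in zip(skill_params, subgoals)])
action = blend_actions(skill_actions, logits)
```

Under this assumption the blended command for each joint always stays within the range spanned by the individual skills' commands, which is one plausible reading of how blending could regularize behavior; the paper itself should be consulted for the actual formulation.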

Paper Structure

This paper contains 51 sections, 11 equations, 9 figures, 11 tables.

Figures (9)

  • Figure 1: SkillBlender performs versatile autonomous humanoid loco-manipulation tasks within different embodiments and environments, given only one or two intuitive reward terms.
  • Figure 2: Overview of SkillBlender. We first pretrain goal-conditioned primitive expert skills that are task-agnostic, reusable, and physically interpretable, and then reuse and blend these skills to achieve complex whole-body loco-manipulation tasks given only one or two task-specific reward terms.
  • Figure 3: Our SkillBench is a parallel, cross-embodiment, and diverse simulated benchmark containing three embodiments, four primitive skills, and eight loco-manipulation tasks.
  • Figure 4: Qualitative comparison between different methods. Our SkillBlender not only achieves higher task accuracy, but also avoids reward hacking and yields more natural and feasible movements.
  • Figure 5: An example of GPT-4o reasoning to perform skill selection on the FarReach task.
  • ...and 4 more figures