BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities

Yunfan Jiang, Ruohan Zhang, Josiah Wong, Chen Wang, Yanjie Ze, Hang Yin, Cem Gokmen, Shuran Song, Jiajun Wu, Li Fei-Fei

TL;DR

BEHAVIOR Robot Suite (BRS) unifies a low-cost, whole-body teleoperation interface (JoyLo) with a novel autoregressive, multi-modal visuomotor policy (WB-VIMA) to tackle real-world household manipulation. JoyLo enables scalable data collection on a dual-arm, torso-equipped mobile robot, while WB-VIMA learns coordinated base-torso-arm actions through hierarchical conditioning and causal attention over multi-modal observations. Across five real-world tasks, WB-VIMA delivers high sub-task success, strong end-to-end performance, emergent long-horizon capabilities, and near-zero safety violations, outperforming baselines and, on certain sub-tasks, even human teleoperation. The work is open-sourced, offering practical paths toward robust, real-world autonomous household robotics.

Abstract

Real-world household tasks present significant challenges for mobile manipulation robots. An analysis of existing robotics benchmarks reveals that successful task performance hinges on three key whole-body control capabilities: bimanual coordination, stable and precise navigation, and extensive end-effector reachability. Achieving these capabilities requires careful hardware design, but the resulting system complexity further complicates visuomotor policy learning. To address these challenges, we introduce the BEHAVIOR Robot Suite (BRS), a comprehensive framework for whole-body manipulation in diverse household tasks. Built on a bimanual, wheeled robot with a 4-DoF torso, BRS integrates a cost-effective whole-body teleoperation interface for data collection and a novel algorithm for learning whole-body visuomotor policies. We evaluate BRS on five challenging household tasks that not only emphasize the three core capabilities but also introduce additional complexities, such as long-range navigation, interaction with articulated and deformable objects, and manipulation in confined spaces. We believe that BRS's integrated robotic embodiment, data collection interface, and learning framework mark a significant step toward enabling real-world whole-body manipulation for everyday household tasks. BRS is open-sourced at https://behavior-robot-suite.github.io/

Paper Structure

This paper contains 50 sections, 2 equations, 16 figures, 14 tables.

Figures (16)

  • Figure 1: Everyday household activities enabled by BEHAVIOR Robot Suite (BRS), showcasing its three core capabilities: bimanual coordination (B), stable and accurate navigation (N), and extensive end-effector reachability (R).
  • Figure 2: Ecological distributions of task-relevant objects in daily household activities. Multiple distinct modes appear in the vertical distance distribution, located at 0.09 m, 0.49 m, 0.94 m, and 1.43 m, representing heights at which objects are typically found.
  • Figure 3: BRS hardware system. Left: The R1 robot with two 6-DoF arms and a 4-DoF torso mounted on an omnidirectional mobile base. Right: The JoyLo system, consisting of compact, off-the-shelf Nintendo Joy-Con controllers mounted at the ends of two kinematic-twin arms. The Joy-Con controllers serve as the interface for controlling the grippers, torso, and mobile base.
  • Figure 4: WB-VIMA architecture. It autoregressively decodes whole-body actions by leveraging the hierarchical interdependencies within the embodiment space, and dynamically aggregates multi-modal observations using self-attention.
  • Figure 5: Evaluation results for five household tasks. Left: Initial randomization. Middle: Success rates over 15 runs ("ET" = entire task, "ST" = sub-task). Right: Number of safety violations.
  • ...and 11 more figures
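
The autoregressive whole-body decoding mentioned in the Figure 4 caption can be illustrated with a minimal sketch. This is not the WB-VIMA implementation: the decoding order (base, then torso, then arms), the action dimensions, the observation embedding size, and the random linear heads standing in for learned decoders are all assumptions made for illustration. Only the 4-DoF torso and the two 6-DoF arms come from the hardware description above.

```python
import math
import random

# Assumed action dimensions (illustrative only): planar base velocity
# (x, y, yaw), the 4-DoF torso, and two 6-DoF arms. Grippers are omitted.
BASE_DIM, TORSO_DIM, ARMS_DIM = 3, 4, 12
OBS_DIM = 16  # hypothetical size of a fused multi-modal observation embedding


def make_head(in_dim, out_dim, rng):
    """Build a random linear map plus tanh, standing in for a learned decoder."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

    def head(x):
        assert len(x) == in_dim, "input size must match the head's input size"
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

    return head


def decode_whole_body(obs, heads):
    """Decode body parts in sequence, conditioning each on the parts before it."""
    base = heads["base"](obs)                    # depends on observations only
    torso = heads["torso"](obs + base)           # also sees the base action
    arms = heads["arms"](obs + base + torso)     # sees base and torso actions
    return {"base": base, "torso": torso, "arms": arms}


rng = random.Random(0)
heads = {
    "base": make_head(OBS_DIM, BASE_DIM, rng),
    "torso": make_head(OBS_DIM + BASE_DIM, TORSO_DIM, rng),
    "arms": make_head(OBS_DIM + BASE_DIM + TORSO_DIM, ARMS_DIM, rng),
}
obs = [rng.gauss(0.0, 1.0) for _ in range(OBS_DIM)]
action = decode_whole_body(obs, heads)
whole_body = action["base"] + action["torso"] + action["arms"]
```

The point of the ordering in this sketch is that a downstream part (here, the arms) is predicted with knowledge of what the upstream parts (base and torso) will do, rather than all joints being predicted independently from the same observation.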