RoboPanoptes: The All-seeing Robot with Whole-body Dexterity
Xiaomeng Xu, Dominik Bauer, Shuran Song
TL;DR
RoboPanoptes tackles the limits of end-effector-centric manipulation by introducing whole-body dexterity powered by whole-body vision. It combines a modular, scalable hardware design with 21 cameras distributed over the body and a whole-body visuomotor policy based on diffusion transformers and cross-attention to learn manipulation skills from demonstrations. Key innovations include view-dependent positional encoding, blink training for sensor robustness, and a leader-follower teleoperation interface to collect diverse data. Empirical results across unboxing, sweeping, and stowing tasks show RoboPanoptes outperforms baselines in accuracy, efficiency, and resilience, suggesting strong practical potential for dexterous manipulation in cluttered or constrained environments.
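As a rough illustration of the blink-training idea mentioned above, the sketch below randomly hides a subset of camera views for each training sample so that a policy cannot rely on any single camera. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `blink_mask` and `apply_blink`, the drop probability, and the choice to zero out hidden features (rather than, say, removing their tokens before cross-attention) are all hypothetical.

```python
import random

NUM_CAMERAS = 21  # camera count stated in the TL;DR


def blink_mask(num_cameras, drop_prob, rng):
    """Sample a visibility mask: True = camera kept, False = camera 'blinked'.

    drop_prob is a hypothetical hyperparameter; the source gives no value.
    """
    mask = [rng.random() >= drop_prob for _ in range(num_cameras)]
    # Assumption: always keep at least one view so the policy has some input.
    if not any(mask):
        mask[rng.randrange(num_cameras)] = True
    return mask


def apply_blink(camera_features, mask):
    """Replace the feature vector of every blinked camera with zeros."""
    return [
        list(feat) if keep else [0.0] * len(feat)
        for feat, keep in zip(camera_features, mask)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy per-camera features standing in for image encoder outputs.
    features = [[float(i), float(i) + 0.5] for i in range(NUM_CAMERAS)]
    mask = blink_mask(NUM_CAMERAS, drop_prob=0.3, rng=rng)
    blinked = apply_blink(features, mask)
    print(sum(mask), "of", NUM_CAMERAS, "cameras visible in this sample")
```

In a training loop, a fresh mask would be sampled for each sample or batch. At test time no views are hidden on purpose, and a failed camera simply looks like one more blinked view.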
Abstract
We present RoboPanoptes, a capable yet practical robot system that achieves whole-body dexterity through whole-body vision. Whole-body dexterity allows the robot to use its entire body surface for manipulation, for example by leveraging multiple contact points or navigating constrained spaces. Whole-body vision, in turn, uses a camera system distributed over the robot's surface to provide comprehensive, multi-perspective visual feedback on the robot's own state and that of the environment. At its core, RoboPanoptes uses a whole-body visuomotor policy that learns complex manipulation skills directly from human demonstrations, efficiently aggregating information from the distributed cameras while remaining resilient to sensor failures. Together, these design aspects unlock new capabilities and tasks, allowing RoboPanoptes to unbox in narrow spaces, sweep multiple or oversized objects, and succeed in multi-step stowing in cluttered environments, outperforming baselines in adaptability and efficiency. Results are best viewed at https://robopanoptes.github.io.
