RAPID: Reconfigurable, Adaptive Platform for Iterative Design

Zi Yin, Fanhong Li, Shurui Zheng, Jia Liu

TL;DR

RAPID tackles the slow iteration cycle in multi-modal robotic manipulation with a tool-free, modular hardware platform and a driver-level Physical Mask that derives modality presence from USB hot-plug events. This hardware-software co-design provides auto-configuration, graceful degradation, and fixed-dimension observations during dynamic reconfiguration, which makes systematic multi-modal ablations practical. System-centric evaluation shows a reduction in reconfiguration time of roughly two orders of magnitude and continued policy execution under sensor hot-unplug events, using a diffusion-policy-based approach that consumes the Physical Mask for mask-aware inference. RAPID also logs modality availability alongside trajectories through a unified I/O stack, which supports data curation for multimodal robot learning, and its designs are open-sourced to encourage adoption.

Abstract

Developing robotic manipulation policies is iterative and hypothesis-driven: researchers test tactile sensing, gripper geometries, and sensor placements through real-world data collection and training. Yet even minor end-effector changes often require mechanical refitting and system re-integration, slowing iteration. We present RAPID, a full-stack reconfigurable platform designed to reduce this friction. RAPID is built around a tool-free, modular hardware architecture that unifies handheld data collection and robot deployment, and a matching software stack that maintains real-time awareness of the underlying hardware configuration through a driver-level Physical Mask derived from USB events. This modular hardware architecture reduces reconfiguration to seconds and makes systematic multi-modal ablation studies practical, allowing researchers to sweep diverse gripper and sensing configurations without repeated system bring-up. The Physical Mask exposes modality presence as an explicit runtime signal, enabling auto-configuration and graceful degradation under sensor hot-plug events, so policies can continue executing when sensors are physically added or removed. System-centric experiments show that RAPID reduces the setup time for multi-modal configurations by two orders of magnitude compared to traditional workflows and preserves policy execution under runtime sensor hot-unplug events. The hardware designs, drivers, and software stack are open-sourced at https://rapid-kit.github.io/ .
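
To make the idea of a presence mask derived from hot-plug events concrete, here is a minimal Python sketch. It is not RAPID's driver: the modality names, the `PhysicalMask` class, and the simulated event list are hypothetical, and the `add`/`remove` action strings simply mirror the action names Linux udev reports for device events.

```python
from dataclasses import dataclass, field

# Hypothetical modality names and ordering; the source does not list them.
MODALITIES = ("wrist_camera", "tactile", "gripper_state")


@dataclass
class PhysicalMask:
    """Tracks which modalities are physically present (illustrative only)."""

    present: dict = field(
        default_factory=lambda: {name: False for name in MODALITIES}
    )

    def on_event(self, action: str, modality: str) -> None:
        """Apply one hot-plug event: 'add' sets the bit, 'remove' clears it."""
        if modality not in self.present:
            return  # unknown device: leave the mask unchanged
        if action == "add":
            self.present[modality] = True
        elif action == "remove":
            self.present[modality] = False

    def as_bits(self) -> list:
        """Return the mask as 0/1 flags in the fixed MODALITIES order."""
        return [int(self.present[name]) for name in MODALITIES]


# Simulated event stream: the tactile sensor is plugged in, then unplugged.
events = [("add", "wrist_camera"), ("add", "tactile"), ("remove", "tactile")]

mask = PhysicalMask()
for action, modality in events:
    mask.on_event(action, modality)

print(mask.as_bits())  # [1, 0, 0]
```

Keeping the mask in a fixed modality order means downstream consumers can read it positionally, whatever the order in which devices were attached.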

Paper Structure (29 sections, 1 equation, 7 figures, 9 tables)

Figures (7)

  • Figure 1: RAPID system overview. RAPID is a reconfigurable robotic platform enabling tool-free, plug-and-play integration of sensors, hot-swappable end-effectors, and modular actuation. The same modular device supports both handheld data collection and robot-mounted deployment through rapid physical reconfiguration.
  • Figure 2: Runtime architecture of the RAPID system. The top panel shows asynchronous multimodal data streams during collection and inference, including robot end-effector state, vision, tactile sensing, and other optional plug-in modalities. The bottom panel illustrates the event-driven pathway from hardware hot-plug events to the driver-generated Physical Mask, which is propagated through lightweight middleware to the application layer. The Physical Mask enables time alignment and zero filling of absent modalities, ensuring fixed-dimensional observations for data recording and mask-aware policy inference.
  • Figure 3: System implementation of RAPID. The discovery layer supports heterogeneous device interfaces (USB-CAN, USB-Serial, USB-LAN, and native USB) through a unified hub. The driver layer performs event-driven device registration and generates the Physical Mask via a virtual device file. The middleware layer publishes sensor streams and the mask over a lightweight publish-subscribe transport (ZeroMQ + Zeroconf). The synchronisation layer aligns multimodal observations within a configurable time window and zero-fills absent channels. The application layer consumes fixed-dimension observations in either collection mode (logging with per-frame mask) or inference mode (mask-aware policy execution).
  • Figure 4: Hardware layer details. (A) Tool-free connector design. (B) Rapid deployment to robot arm as end-effector.
  • Figure 5: Experiment task for evaluating runtime modality change. Blocks A and B appear identical to the wrist camera but differ in surface texture (triangular vs. hexagonal) perceivable through tactile sensing. The policy must grasp, identify, and place the triangular block to the left and the hexagonal block to the front. We test system behavior when the tactile sensor is unplugged during execution.
  • ...and 2 more figures
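
The time alignment and zero filling described in the Figure 2 and Figure 3 captions can be sketched as follows. This is a minimal illustration under stated assumptions, not the RAPID implementation: the function `build_observation`, the feature sizes in `DIMS`, the 50 ms window, and the choice to also zero-fill a present but stale channel are all hypothetical.

```python
# Hypothetical per-modality feature sizes; the source gives no real dimensions.
DIMS = {"ee_state": 3, "vision": 4, "tactile": 2}
ORDER = ("ee_state", "vision", "tactile")


def build_observation(t_ref, latest, physical_mask, window=0.05):
    """Build a fixed-length observation and the mask that describes it.

    t_ref:         reference time of the frame, in seconds
    latest:        modality -> (timestamp, values) for the newest sample
    physical_mask: modality -> bool, presence as reported by the driver
    window:        maximum |t_ref - timestamp| accepted for alignment (assumed)
    """
    obs, mask = [], []
    for name in ORDER:
        sample = latest.get(name)
        usable = (
            physical_mask.get(name, False)
            and sample is not None
            and abs(t_ref - sample[0]) <= window
        )
        if usable:
            obs.extend(sample[1])
            mask.append(1)
        else:
            # Absent (or stale) channel: zero-fill so the length never changes.
            obs.extend([0.0] * DIMS[name])
            mask.append(0)
    return obs, mask


# The tactile sensor has been unplugged, so its mask bit is False.
physical_mask = {"ee_state": True, "vision": True, "tactile": False}
latest = {
    "ee_state": (10.00, [0.1, 0.2, 0.3]),
    "vision": (10.02, [1.0, 1.0, 1.0, 1.0]),
}

obs, mask = build_observation(10.01, latest, physical_mask)
print(len(obs), mask)  # 9 [1, 1, 0]
```

Because the observation length is always the sum of the configured sizes, a policy or logger can consume it unchanged across hot-plug events and use the mask to tell real zeros from filled ones.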