FlexiCup: Wireless Multimodal Suction Cup with Dual-Zone Vision-Tactile Sensing
Junhao Gong, Shoujie Li, Kit-Wa Sou, Changqing Guo, Hourong Huang, Tong Wu, Yifan Xie, Chenxin Liang, Chuqiao Lyu, Xiaojun Liang, Wenbo Ding
TL;DR
FlexiCup addresses the sensing gap in conventional suction cups with a fully wireless, dual-mode suction end effector that integrates vision-tactile sensing. The system switches modalities within a single optical path by controlling illumination, and it supports both vacuum and Bernoulli suction via modular bottom housings, with perception, planning, and control running onboard. Key contributions include a modular hardware architecture, dual-zone sensing validated by multimodal recognition and modular grasping, and a diffusion-policy framework with multi-head attention for end-to-end contact-aware manipulation. Modular grasping achieves comparable mean success across obstacle densities (vacuum 90.0%, Bernoulli 86.7%), and the end-to-end policy reaches 73.3% success on inclined transport and 66.7% on orange extraction. The work advances autonomous, contact-aware manipulation in unstructured settings and releases open hardware designs to accelerate research in multimodal suction manipulation.
Abstract
Conventional suction cups lack the sensing capabilities needed for contact-aware manipulation in unstructured environments. This paper presents FlexiCup, a fully wireless multimodal suction cup that integrates dual-zone vision-tactile sensing. The central zone dynamically switches between vision and tactile modalities via illumination control for contact detection, while the peripheral zone provides continuous spatial awareness for approach planning. FlexiCup supports both vacuum and Bernoulli suction modes through modular mechanical configurations, achieving complete wireless autonomy with onboard computation and power. We validate hardware versatility through two control paradigms. In modular perception-driven grasping across structured surfaces with varying obstacle densities, the vacuum (90.0% mean success) and Bernoulli (86.7% mean success) modes perform comparably. Diffusion-based end-to-end learning achieves 73.3% success on inclined transport and 66.7% success on orange extraction. Ablation studies confirm that multi-head attention coordinating dual-zone observations yields a 13% improvement for contact-aware manipulation. Hardware designs and firmware are available at https://anonymous.4open.science/api/repo/FlexiCup-DA7D/file/index.html?v=8f531b44.
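The abstract mentions multi-head attention that coordinates the central-zone and peripheral-zone observations but does not describe the architecture. As a rough illustration only, the sketch below treats each zone's feature vector as one token and applies standard scaled dot-product multi-head self-attention over the two tokens. The function name `fuse_zones`, the feature dimension, the number of heads, and the random projection matrices (standing in for learned weights) are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_zones(central, peripheral, num_heads=4, seed=0):
    """Fuse two zone feature vectors with multi-head self-attention.

    central, peripheral: 1-D arrays of equal length d (hypothetical features).
    Returns (fused, weights): fused has shape (d,), weights has shape
    (num_heads, 2, 2) and each row of weights sums to 1.
    """
    tokens = np.stack([central, peripheral]).astype(float)  # (2, d)
    n, d = tokens.shape
    if d % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = d // num_heads

    # Random projections stand in for learned weights (assumption).
    rng = np.random.default_rng(seed)
    w_q, w_k, w_v, w_o = (
        rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4)
    )

    def split_heads(x):
        # (n, d) -> (num_heads, n, head_dim)
        return x.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q = split_heads(tokens @ w_q)
    k = split_heads(tokens @ w_k)
    v = split_heads(tokens @ w_v)

    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (heads, n, n)
    weights = softmax(scores, axis=-1)
    context = weights @ v                                   # (heads, n, head_dim)

    # Concatenate heads, project, and average the two zone tokens.
    merged = context.transpose(1, 0, 2).reshape(n, d) @ w_o  # (n, d)
    return merged.mean(axis=0), weights


if __name__ == "__main__":
    fused, weights = fuse_zones(np.ones(8), np.linspace(0.0, 1.0, 8))
    print(fused.shape, weights.shape)
```

In a learned policy the projections would be trained jointly with the rest of the network, and the fused vector would condition the downstream action head; averaging the two output tokens is just one simple pooling choice.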
