Vision-Guided Loco-Manipulation with a Snake Robot
Adarsh Salagame, Sasank Potluri, Keshav Bharadwaj Vaidyanathan, Kruthika Gangaraju, Eric Sihite, Milad Ramezani, Alireza Ramezani
TL;DR
This work addresses autonomous loco-manipulation for a highly articulated snake robot (COBRA) by fusing onboard vision and control to detect objects several meters away, estimate their pose, and manipulate them. The authors implement a vision-guided pipeline that combines onboard YOLOv8 Nano-based detection, depth-assisted 6-DOF pose estimation, and a COM-based path-tracking framework optimized via nonlinear programming to enable closed-loop docking and object transport. Key contributions include end-to-end integration of perception and control on COBRA, an evaluation comparing YOLOv8 Nano and Mask2Former for docking-module detection, and a demonstration of real-time onboard detection, pose estimation, and open-loop loco-manipulation. The results illustrate the feasibility of autonomous manipulation in confined and challenging environments; future work focuses on perception robustness, feedback control integration, and adaptation to dynamic or cluttered settings.
Abstract
This paper presents the development and integration of a vision-guided loco-manipulation pipeline for Northeastern University's snake robot, COBRA. The system leverages a YOLOv8-based object detection model and depth data from an onboard stereo camera to estimate the 6-DOF pose of target objects in real time. We introduce a framework for autonomous detection and control, enabling closed-loop loco-manipulation for transporting objects to specified goal locations. Additionally, we demonstrate open-loop experiments in which COBRA successfully performs real-time object detection and loco-manipulation tasks.
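As an illustration of how a detection and stereo depth data can be combined, the following minimal sketch back-projects the center of a detected bounding box into a 3D point in the camera frame using the standard pinhole camera model. It is not the authors' implementation: the function name `backproject_bbox_center`, the bounding-box layout, the median depth filter, and the intrinsics `fx`, `fy`, `cx`, `cy` are all assumptions made for this example. It recovers only the translational part of the pose; the orientation needed for a full 6-DOF estimate would require further processing that is not shown here.

```python
from statistics import median


def backproject_bbox_center(bbox, depth, fx, fy, cx, cy):
    """Estimate the 3D position (camera frame, meters) of a detected object.

    Hypothetical helper, not taken from the paper.
    bbox  : (u_min, v_min, u_max, v_max) in pixels, max values exclusive.
    depth : 2D list indexed as depth[v][u], in meters; None or <= 0 is invalid.
    fx, fy, cx, cy : pinhole intrinsics of the depth-aligned camera (assumed known).
    Returns (x, y, z), or None if the box contains no valid depth.
    """
    u_min, v_min, u_max, v_max = bbox

    # Collect valid depth readings inside the box.
    samples = []
    for v in range(v_min, v_max):
        for u in range(u_min, u_max):
            d = depth[v][u]
            if d is not None and d > 0:
                samples.append(d)
    if not samples:
        return None

    # The median is an assumed choice to reduce the effect of depth outliers.
    z = median(samples)

    # Pinhole back-projection of the box center at the estimated depth.
    u_c = (u_min + u_max) / 2.0
    v_c = (v_min + v_max) / 2.0
    x = (u_c - cx) * z / fx
    y = (v_c - cy) * z / fy
    return (x, y, z)
```

In a pipeline like the one described, `bbox` would come from the object detector and `depth` from the stereo camera's depth map, with the resulting point passed to the controller as the target position.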
