Embodiment-Agnostic Navigation Policy Trained with Visual Demonstrations
Nimrod Curtis, Osher Azulay, Avishai Sintov
TL;DR
ViDEN is an embodiment-agnostic navigation framework trained on depth-based visual demonstrations for robust, collision-free pursuit of dynamic targets. It combines a diffusion-based behavior cloning policy with a compact depth-driven state representation, achieving task-centric tracking without robot-specific topologies or pre-defined target RGB images. The approach is data efficient (≈1.5 hours of robot-independent demonstrations) and generalizes well, including zero-shot transfer and effective fine-tuning with modest additional data. The result is a practical, scalable navigation approach for diverse robots in indoor and outdoor environments, with open-source code for benchmarking and extending the method.
Abstract
Learning to navigate in unstructured environments is a challenging task for robots. While reinforcement learning can be effective, it often requires extensive data collection and can pose safety risks. Learning from expert demonstrations, on the other hand, offers a more efficient approach. However, many existing methods rely on specific robot embodiments and pre-specified target images, and require large datasets. We propose the Visual Demonstration-based Embodiment-agnostic Navigation (ViDEN) framework, which leverages visual demonstrations to train embodiment-agnostic navigation policies. ViDEN uses depth images to reduce input dimensionality and relies on relative target positions, making it more adaptable to diverse environments. By training a diffusion-based policy on task-centric and embodiment-agnostic demonstrations, ViDEN can generate collision-free and adaptive trajectories in real time. Our experiments on human reaching and tracking demonstrate that ViDEN outperforms existing methods while requiring only a small amount of data, across various indoor and outdoor navigation scenarios. Project website: https://nimicurtis.github.io/ViDEN/.
