Hierarchical end-to-end autonomous navigation through few-shot waypoint detection
Amin Ghafourian, Zhongying CuiZhu, Debo Shi, Ian Chuang, Francois Charette, Rithik Sachdeva, Iman Soltani
TL;DR
This work tackles autonomous navigation under limited localization by introducing a Description-based Navigation System (DNS) that relies on a hierarchical end-to-end architecture and few-shot landmark detection. It formulates a distribution-embedding, metric-based approach to recognize waypoint landmarks from minimal examples; each detection looks up the associated high-level action, which in turn triggers a low-level maneuver controller. The key contributions are the two-stage DNS framework, a novel distribution-based few-shot learning method using mean/covariance embeddings and a distribution-to-distribution distance, and empirical validation on unseen indoor routes, with ablation studies showing the impact of backbone pretraining and metric choice. The approach promises data-efficient, adaptable navigation with reduced reliance on precise localization, demonstrated on a small-scale vehicle in indoor environments.
Abstract
Human navigation is facilitated through the association of actions with landmarks, tapping into our ability to recognize salient features in our environment. Consequently, navigational instructions for humans can be extremely concise, such as short verbal descriptions, indicating a small memory requirement and no reliance on complex and overly accurate navigation tools. In contrast, current autonomous navigation schemes rely on accurate positioning devices and algorithms as well as extensive streams of sensory data collected from the environment. Inspired by this human capability and motivated by the associated technological gap, in this work we propose a hierarchical end-to-end meta-learning scheme that enables a mobile robot to navigate in a previously unknown environment upon presentation of only a few sample images of a set of landmarks along with their corresponding high-level navigation actions. This dramatically simplifies the wayfinding process and enables easy adaptation to new environments. For few-shot waypoint detection, we implement a metric-based few-shot learning technique through distribution embedding. Waypoint detection triggers the multi-task low-level maneuver controller module to execute the corresponding high-level navigation action. We demonstrate the effectiveness of the scheme using a small-scale autonomous vehicle on novel indoor navigation tasks in several previously unseen environments.
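
To make the detection-then-lookup idea concrete, here is a minimal sketch, not the authors' implementation. It assumes each landmark's few sample images have already been turned into feature vectors by some backbone, summarizes them by a per-dimension mean and variance (a diagonal simplification of a mean/covariance embedding), and compares distributions with the squared 2-Wasserstein distance between diagonal Gaussians. The abstract does not state which distance, threshold, or action names are used; `w2_squared`, `ACTIONS`, the threshold value, and the fallback action `follow_corridor` are all hypothetical.

```python
import math
from statistics import fmean, pvariance


def embed(features):
    """Summarize a set of feature vectors by per-dimension mean and variance."""
    dims = list(zip(*features))
    return [fmean(d) for d in dims], [pvariance(d) for d in dims]


def w2_squared(p, q):
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    Assumption: this metric is chosen for illustration only.
    """
    (m1, v1), (m2, v2) = p, q
    mean_term = sum((a - b) ** 2 for a, b in zip(m1, m2))
    spread_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(v1, v2))
    return mean_term + spread_term


def detect(query, prototypes, threshold):
    """Return the nearest landmark name, or None if none is within threshold."""
    q = embed(query)
    name, dist = min(
        ((n, w2_squared(q, p)) for n, p in prototypes.items()),
        key=lambda item: item[1],
    )
    return name if dist <= threshold else None


# Hypothetical lookup table from landmark to high-level navigation action.
ACTIONS = {"door": "turn_left", "stairs": "stop"}


def high_level_action(query, prototypes, threshold=1.0):
    """Look up the action for a detected landmark; otherwise keep the default."""
    landmark = detect(query, prototypes, threshold)
    return ACTIONS.get(landmark, "follow_corridor")


# Few-shot "support" features per landmark (toy 2-D values, not real data).
prototypes = {
    "door": embed([[0.0, 1.0], [0.2, 1.2], [0.1, 0.8]]),
    "stairs": embed([[5.0, 5.0], [5.2, 4.8], [4.9, 5.1]]),
}
```

In this sketch, adding a new landmark only requires embedding a handful of its feature vectors and adding one entry to the lookup table; nothing is retrained. The returned action string stands in for the command that would be passed to the low-level maneuver controller.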
