SCALE: Self-Correcting Visual Navigation for Mobile Robots via Anti-Novelty Estimation
Chang Chen, Yuecheng Liu, Yuzheng Zhuang, Sitong Mao, Kaiwen Xue, Shunbo Zhou
TL;DR
SCALE targets robust real-world visual navigation learned from offline data by addressing out-of-distribution (OOD) observations and localization failures. It combines image-goal navigation learned with offline Implicit Q-Learning (IQL) with a self-supervised localization recovery module that imagines multi-step trajectories through a conditional affordance model, guided by anti-novelty estimates from Random Network Distillation (RND). The approach introduces a temporally informed, GRU-based prediction to enable aggressive subgoal estimation and uses Model Predictive Path Integral (MPPI) control to optimize trajectories under a constrained cost that penalizes novelty and facilitates localization. Experiments in three outdoor urban scenarios show that SCALE with localization recovery significantly outperforms state-of-the-art baselines, reducing the need for human intervention and improving robustness to scenario changes. The work offers a practical path toward robust, GPS-denied navigation for mobile robots using only forward-facing vision and offline data.
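To make the MPPI step above concrete, the following is a minimal sketch of one MPPI-style update with a novelty penalty in the cost. It is not the SCALE implementation: the 2D point dynamics, the function `mppi_step`, the stand-in `corridor_novelty` function, and all parameter values are assumptions chosen for illustration, whereas the paper rolls trajectories out through a learned affordance model and scores novelty with RND.

```python
import numpy as np

def mppi_step(state, goal, novelty_fn, horizon=10, n_samples=256,
              noise_std=0.5, temperature=1.0, novelty_weight=5.0, seed=0):
    """One MPPI-style update around a zero nominal plan (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Sample candidate action sequences: (n_samples, horizon, 2) displacements.
    actions = rng.normal(0.0, noise_std, size=(n_samples, horizon, 2))
    # Toy dynamics (assumption): the state is a 2D point, actions are displacements.
    trajs = state + np.cumsum(actions, axis=1)
    # Cost = distance of the final state to the goal + weighted novelty along the path.
    goal_cost = np.linalg.norm(trajs[:, -1] - goal, axis=1)
    novelty = novelty_fn(trajs.reshape(-1, 2)).reshape(n_samples, horizon)
    cost = goal_cost + novelty_weight * novelty.sum(axis=1)
    # Exponentiated negative cost gives the importance weights (shifted for stability).
    weights = np.exp(-(cost - cost.min()) / temperature)
    weights /= weights.sum()
    # The updated plan is the weight-averaged action sequence.
    plan = np.einsum("n,nht->ht", weights, actions)
    return plan, weights, cost

def corridor_novelty(points):
    # Hypothetical stand-in for a learned novelty score: zero inside |y| <= 1.
    return np.maximum(0.0, np.abs(points[:, 1]) - 1.0)

state = np.zeros(2)
goal = np.array([3.0, 0.0])
plan, weights, cost = mppi_step(state, goal, corridor_novelty)
```

Candidates that leave the familiar corridor accumulate novelty cost and receive exponentially smaller weight, so the averaged plan is pulled toward trajectories that stay in familiar regions while approaching the goal.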
Abstract
Although visual navigation has been extensively studied using deep reinforcement learning, online learning for real-world robots remains a challenging task. Recent work learns directly from offline datasets to achieve broader generalization in real-world tasks, but it faces the out-of-distribution (OOD) issue and potential robot localization failures in a given map for unseen observations. These failures significantly reduce success rates and can even cause collisions. In this paper, we present a self-correcting visual navigation method, SCALE, that can autonomously steer the robot away from OOD situations without human intervention. Specifically, we develop an image-goal conditioned offline reinforcement learning method based on implicit Q-learning (IQL). When facing an OOD observation, our novel localization recovery method generates potential future trajectories by learning from the navigation affordance and estimates their future novelty via random network distillation (RND). A tailored cost function searches for the candidate with the least novelty that can lead the robot back to familiar places. We collect offline data and conduct evaluation experiments in three real-world urban scenarios. Experimental results show that SCALE outperforms the previous state-of-the-art methods for open-world navigation with a unique capability of localization recovery, significantly reducing the need for human intervention. Code is available at https://github.com/KubeEdge4Robotics/ScaleNav.
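The novelty estimate via RND mentioned above can be illustrated with a small sketch. In RND, a predictor is trained to match the output of a fixed, randomly initialized target network on familiar data, and the prediction error on a new input serves as its novelty score. The code below is an assumption-laden toy, not the released ScaleNav code: the class `RNDNovelty` is hypothetical, it operates on low-dimensional feature vectors rather than images, and it fits a linear predictor by least squares for brevity where RND normally trains a neural predictor by gradient descent.

```python
import numpy as np

class RNDNovelty:
    """Toy RND-style novelty score on feature vectors (illustrative only)."""

    def __init__(self, in_dim, feat_dim=16, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random target network: one tanh layer that is never trained.
        self.W_target = rng.normal(size=(in_dim, feat_dim))
        self.b_target = rng.normal(size=feat_dim)
        # Linear predictor (weights plus bias row), fitted on familiar data.
        self.W_pred = np.zeros((in_dim + 1, feat_dim))

    def _target(self, x):
        return np.tanh(x @ self.W_target + self.b_target)

    @staticmethod
    def _with_bias(x):
        return np.hstack([x, np.ones((x.shape[0], 1))])

    def fit(self, x):
        # Minimize the squared error between predictor and target outputs.
        self.W_pred, *_ = np.linalg.lstsq(
            self._with_bias(x), self._target(x), rcond=None
        )

    def novelty(self, x):
        # Per-sample prediction error; high values indicate unfamiliar inputs.
        err = self._with_bias(x) @ self.W_pred - self._target(x)
        return np.mean(err ** 2, axis=1)

rng = np.random.default_rng(1)
familiar = rng.normal(0.0, 0.3, size=(500, 8))    # stands in for offline data
unfamiliar = rng.normal(4.0, 0.3, size=(50, 8))   # stands in for OOD observations

rnd = RNDNovelty(in_dim=8)
rnd.fit(familiar)
familiar_score = rnd.novelty(familiar).mean()
unfamiliar_score = rnd.novelty(unfamiliar).mean()
```

Because the predictor only matches the target where it has seen data, inputs far from the familiar set yield a larger error, which is the signal a cost function can penalize when ranking candidate trajectories.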
