Skip-SCAR: Hardware-Friendly High-Quality Embodied Visual Navigation
Yaotian Liu, Yu Cao, Jeff Zhang
TL;DR
Skip-SCAR addresses the computational bottlenecks of ObjectNav by integrating an adaptive skip semantic mapping module with a SparseConv-Augmented ResNet (SCAR) for target probability prediction. The adaptive mapping enables lossless and aggressive skips to bypass redundant semantic segmentation and replanning, while SCAR drastically reduces memory and FLOPs relative to dense predictors. Evaluations on HM3D ObjectNav and real hardware show Skip-SCAR achieving state-of-the-art navigation quality with large speedups and memory savings, largely due to the synergy of adaptive skipping and sparse-convolution-based prediction. Overall, the approach demonstrates that jointly optimizing navigation performance and computational efficiency can yield practical, scalable robotic navigation systems.
Abstract
In ObjectNav, agents must locate specific objects within unseen environments, requiring effective perception, prediction, localization, and planning capabilities. This study finds that state-of-the-art embodied AI agents compete on navigation quality but often do so at the expense of computational efficiency. To address this issue, we introduce "Skip-SCAR," an optimization framework that builds computation- and memory-efficient embodied AI agents for high-quality visual navigation tasks. Skip-SCAR opportunistically skips redundant step computations during semantic segmentation and local re-planning without hurting navigation quality. Skip-SCAR also adopts a novel hybrid sparse and dense network for object prediction, optimizing both computation and memory footprint. Tested on the HM3D ObjectNav datasets and real-world physical hardware systems, Skip-SCAR not only minimizes hardware resource usage but also sets new performance benchmarks, demonstrating the benefits of optimizing both navigation quality and computational efficiency for robotics.
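To make the idea of opportunistic skipping concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical criterion in which segmentation is skipped when the current view reveals no (or few) map cells beyond those already explored; the function name `should_skip_segmentation`, the `max_new_cells` parameter, and the set-of-cells map representation are all illustrative assumptions. A threshold of zero loosely corresponds to a lossless skip, and a positive threshold to a more aggressive one.

```python
from typing import Set, Tuple

# A map cell is identified by its (row, col) grid index (illustrative assumption).
Cell = Tuple[int, int]


def should_skip_segmentation(
    explored: Set[Cell],
    visible: Set[Cell],
    max_new_cells: int = 0,
) -> bool:
    """Decide whether semantic segmentation can be skipped for this step.

    Hypothetical rule: skip when the current view adds at most
    `max_new_cells` cells that were not already in the explored map.
    """
    new_cells = visible - explored  # cells seen now but never seen before
    return len(new_cells) <= max_new_cells


# Cells already covered by the map, and two candidate views.
explored = {(0, 0), (0, 1)}
redundant_view = {(0, 1)}          # nothing new: safe to skip
novel_view = {(0, 1), (1, 1)}      # one new cell: segment unless thresholded

skip_redundant = should_skip_segmentation(explored, redundant_view)
skip_novel = should_skip_segmentation(explored, novel_view)
skip_novel_aggressive = should_skip_segmentation(explored, novel_view, max_new_cells=1)
```

Under this assumed rule, raising `max_new_cells` trades some map fidelity for fewer segmentation calls, which is the kind of quality-versus-efficiency trade-off the abstract describes.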
