CostNav: A Navigation Benchmark for Cost-Aware Evaluation of Embodied Agents
Haebin Seong, Sungmin Kim, Minchan Kim, Yongjun Cho, Myunchul Joe, Suhwan Choi, Jaeyoon Jung, Jiyong Youn, Yoonshik Kim, Samwoo Seong, Yubeen Park, Youngjae Yu, Yunsung Lee
TL;DR
CostNav introduces a cost-aware navigation benchmark that translates navigation performance into economic outcomes by modeling the full lifecycle of autonomous delivery systems: upfront hardware and training costs, per-run energy and maintenance costs, SLA-adjusted revenue, and break-even analysis. The framework is instantiated in a high-fidelity simulator (Isaac Lab) with the COCO delivery robot and evaluates a learning-based on-device baseline under Level 1–2 urban sidewalk scenarios. The baseline achieves 43% SLA compliance and loses $30.009 per run, with collision-induced maintenance accounting for 99.7% of per-run costs. These results show a substantial gap between current navigation performance and commercial viability, and they point to collision avoidance and SLA compliance as the primary levers for profitability. By providing cost-aware metrics and an economic evaluation workflow, CostNav enables like-for-like comparisons across navigation paradigms and supports data-driven deployment decisions for real-world embodied AI systems.
Abstract
Existing navigation benchmarks focus on task success metrics while overlooking economic viability, which is critical for commercial deployment of autonomous delivery robots. We introduce \emph{CostNav}, a \textbf{Micro-Navigation Economic Testbed} that evaluates embodied agents through comprehensive cost-revenue analysis aligned with real-world business operations. CostNav models the complete economic lifecycle, including hardware, training, energy, and maintenance costs as well as delivery revenue under service-level agreements (SLAs). \textbf{To our knowledge, CostNav is the first work to quantitatively expose the gap between navigation research metrics and commercial viability}, revealing that optimizing for task success fundamentally differs from optimizing for economic deployment. Our cost model uses parameters derived from industry data sources (energy rates, delivery service pricing), and we project results from a reduced-scale simulation to realistic deliveries. Under this projection, the baseline achieves 43.0\% SLA compliance but is \emph{not} commercially viable: it loses \$30.009 per run and has no finite break-even point. Operating costs are dominated by collision-induced maintenance, which accounts for 99.7\% of per-run costs, making collision avoidance a key optimization target. We demonstrate a learning-based on-device navigation baseline and establish a foundation for evaluating rule-based navigation, imitation learning, and cost-aware RL training. CostNav bridges the gap between navigation research and commercial deployment, enabling data-driven decisions about economic trade-offs across navigation paradigms.
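
The break-even reasoning in the abstract can be illustrated with a minimal sketch. This is not the CostNav implementation: the function names, the simple "revenue minus energy and maintenance" profit formula, and the upfront cost figure are assumptions made for illustration. Only the per-run loss of \$30.009 comes from the abstract.

```python
import math
from typing import Optional


def profit_per_run(revenue: float, energy_cost: float, maintenance_cost: float) -> float:
    """Per-run profit, assumed here to be revenue minus per-run operating costs."""
    return revenue - (energy_cost + maintenance_cost)


def break_even_runs(upfront_cost: float, per_run_profit: float) -> Optional[int]:
    """Number of runs needed to recover the upfront cost.

    Returns None when the per-run profit is zero or negative, because the
    upfront cost is then never recovered (no finite break-even point).
    """
    if per_run_profit <= 0:
        return None
    return math.ceil(upfront_cost / per_run_profit)


# Per-run profit reported in the abstract (a loss of $30.009 per run).
reported_profit = -30.009
# Hypothetical upfront hardware + training cost, NOT taken from the paper.
hypothetical_upfront = 10_000.0

print(break_even_runs(hypothetical_upfront, reported_profit))
```

With a negative per-run profit the function returns `None` regardless of the upfront cost chosen, which mirrors the abstract's statement that the baseline has no finite break-even point.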
